PhD Proposal: Understanding and Enhancing Machine Learning Models with Theoretical Foundations
Zhengmian Hu
Friday, February 2, 2024, 10:00 am-12:00 pm
Abstract

I will present a study of neural network kernel distributions, aimed at understanding the performance and scalability of these models. Specifically, I investigate the distributions of the Conjugate Kernel (CK) and the Neural Tangent Kernel (NTK) for ReLU networks under random initialization. Through rigorous analysis, I derive precise distributions for the diagonal elements of these kernels. For a feedforward network, these values converge in law to a log-normal distribution as the network depth d and width n tend to infinity simultaneously, with the variance of the log diagonal elements proportional to d/n. For a residual network, in the limit where the number of branches m tends to infinity while the width n remains fixed, the diagonal elements of the Conjugate Kernel converge in law to a log-normal distribution whose log-variance is proportional to 1/n, and the diagonal elements of the NTK converge in law to a log-normally distributed variable times the Conjugate Kernel of a single feedforward network. These theoretical findings suggest that residual networks can remain trainable even in the limit of infinite branching and constant network width. Numerical experiments are conducted, and the results validate the soundness of the theoretical analysis.
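To make the feedforward result concrete, below is a minimal simulation sketch (not part of the talk materials) that estimates the variance of the log CK diagonal for a randomly initialized ReLU network. It assumes He-style Gaussian initialization with variance 2/fan-in and a fixed unit-norm input; the function names (ck_diagonal, log_ck_variance) and all parameter choices are illustrative rather than taken from the work itself. Under the stated result, the measured variance should grow roughly in proportion to d/n.

import numpy as np

def ck_diagonal(x, depth, width, rng):
    # Forward a fixed input through a random ReLU network with He-style
    # initialization and return the empirical CK diagonal entry ||h_L(x)||^2 / n.
    h = x
    for _ in range(depth):
        W = rng.normal(0.0, np.sqrt(2.0 / h.shape[0]), size=(width, h.shape[0]))
        h = np.maximum(W @ h, 0.0)  # ReLU activation
    return np.dot(h, h) / width

def log_ck_variance(depth, width, n_trials=2000, seed=0):
    # Empirical variance of log CK diagonal across random initializations.
    rng = np.random.default_rng(seed)
    x = np.ones(width) / np.sqrt(width)  # fixed unit-norm input
    samples = [np.log(ck_diagonal(x, depth, width, rng)) for _ in range(n_trials)]
    return np.var(samples)

if __name__ == "__main__":
    width = 64
    for depth in (8, 16, 32):
        v = log_ck_variance(depth, width)
        print(f"depth={depth:3d}  width={width}  var(log CK) ~ {v:.4f}  d/n = {depth / width:.4f}")

Running this at a fixed width for several depths gives a quick empirical check that the variance of the log CK diagonal scales roughly linearly with depth, consistent with the d/n rate described in the abstract.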

Bio

Zhengmian Hu is a PhD student in the Department of Computer Science at the University of Maryland, College Park, advised by Dr. Heng Huang. His research lies at the intersection of optimization in machine learning, deep learning theory, and probabilistic methods, aimed at improving our understanding of artificial intelligence. Zhengmian has published many papers in top-tier machine learning conferences, including ICML, NeurIPS, and ICLR.

This talk is organized by Migo Gui