Many modern machine learning problems are solved using complex over-parametrized models on large datasets and require scalable algorithms with low computation and memory complexity. This often leads to the use of non-convex methods that do not necessarily have good computational or statistical foundations.
In this talk, we will explore the role of simple local search algorithms, such as stochastic gradient descent, in optimizing non-convex functions, and leverage this to design efficient algorithms for a large and important class of machine learning problems: those involving fitting low-rank matrices to data. We will then discuss a surprising role these local search methods play in implicitly regularizing the complexity of over-parametrized models, in problems involving both low-rank matrices and deep neural networks.
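As a minimal illustration of the kind of local search the talk considers, the sketch below runs plain gradient descent on a non-convex factored parameterization, fitting a low-rank matrix M as a product U V^T (a Burer-Monteiro-style formulation). This is an assumed toy setup, not the speaker's actual algorithm; all names, dimensions, and step sizes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 30, 20, 3

# Ground-truth low-rank matrix we try to recover.
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))

# Small random initialization; local search starts near the origin.
U = 0.1 * rng.standard_normal((n, r))
V = 0.1 * rng.standard_normal((m, r))

lr = 0.01  # step size
for _ in range(2000):
    R = U @ V.T - M       # residual of the current fit
    U -= lr * (R @ V)     # gradient of 0.5 * ||U V^T - M||_F^2 w.r.t. U
    V -= lr * (R.T @ U)   # gradient w.r.t. V

# Relative Frobenius error of the recovered matrix.
err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
print(f"relative error: {err:.2e}")
```

Despite the non-convexity of the factored objective, gradient descent from a small random initialization typically converges to a good fit in this fully observed setting, which is the kind of phenomenon the talk examines.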
His research focuses primarily on designing statistically efficient algorithms for large-scale machine learning problems. He is interested in matrix and tensor factorization, non-convex optimization, neural networks, and sub-linear time algorithms.