PhD Defense: Efficient Optimization Algorithms for Nonconvex Machine Learning Problems
Wenhan Xian
Thursday, July 11, 2024, 2:00-3:00 pm

In recent years, the AI revolution has driven the training of ever larger neural networks on vast amounts of data to achieve superior performance. These powerful machine learning models have enabled the creation of remarkable AI products. Optimization lies at the core of machine learning: most machine learning problems can ultimately be formulated as optimization problems that minimize a loss function with respect to model parameters, based on training samples.

Distributed learning has emerged as a popular way to make optimization more efficient on large-scale machine learning tasks. In distributed learning, multiple worker nodes collaborate to train a global model. A key challenge, however, is the communication cost. This thesis introduces a novel adaptive gradient algorithm with gradient sparsification to address this issue.
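
For readers unfamiliar with gradient sparsification, the sketch below illustrates one common variant, top-k sparsification, in which a worker transmits only the k largest-magnitude gradient entries. It is not the algorithm from the thesis, which the abstract does not specify; the function name `top_k_sparsify` and the sample gradient are hypothetical and chosen for illustration only.

```python
def top_k_sparsify(grad, k):
    """Return a copy of grad that keeps only the k largest-magnitude entries."""
    if k >= len(grad):
        return list(grad)
    # Indices ordered by decreasing absolute value; keep the first k of them.
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


# Hypothetical local gradient on one worker.
grad = [0.1, -2.0, 0.03, 1.5, -0.2]
sparse = top_k_sparsify(grad, k=2)

# The dropped part of the gradient. Some sparsification schemes keep this
# residual locally and add it to the next step's gradient (error feedback).
residual = [g - s for g, s in zip(grad, sparse)]
```

Only the nonzero entries of `sparse` and their indices would need to be sent, which is where the communication saving comes from.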

Another significant challenge in distributed learning is the communication overhead on the central parameter server. To mitigate this bottleneck, decentralized (serverless) distributed learning has been proposed, in which each worker node communicates only with its neighbors. This thesis investigates core nonconvex optimization problems in decentralized settings, including constrained optimization, minimax optimization, and second-order optimality. An efficient optimization algorithm is proposed for each of these problems.
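
To make the neighbor-only communication pattern concrete, the sketch below shows plain gossip averaging: in each round every worker replaces its value with the average of its own value and its neighbors' values, with no central server involved. It is a generic illustration, not a method from the thesis; the ring topology, the function name `gossip_step`, and the starting values are all assumptions.

```python
def gossip_step(values, neighbors):
    """One round: each node averages its value with its neighbors' values."""
    new_values = []
    for i, v in enumerate(values):
        group = [v] + [values[j] for j in neighbors[i]]
        new_values.append(sum(group) / len(group))
    return new_values


# Hypothetical ring of 4 workers; each one talks only to its two neighbors.
neighbors = {0: [3, 1], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
values = [1.0, 2.0, 3.0, 4.0]  # e.g. one model parameter per worker

for _ in range(50):
    values = gossip_step(values, neighbors)
```

On this ring every worker has the same number of neighbors, so uniform averaging preserves the global mean and all workers approach it. On irregular graphs, different mixing weights would be needed for the same guarantee.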

Additionally, the convergence analysis of minimax optimization under the generalized smooth condition is explored. A generalized algorithm is proposed, which can be applied to a broader range of applications.


Wenhan Xian is a PhD candidate in the Department of Computer Science at the University of Maryland, College Park, advised by Dr. Heng Huang. His research focuses on developing faster optimization algorithms for machine learning problems and on their theoretical analysis. His research areas include, but are not limited to, nonconvex optimization, minimax optimization, distributed learning, and federated learning. Wenhan has published many papers in top-tier machine learning conferences, including NeurIPS and ICML.

This talk is organized by Migo Gui