Abstract for the talk on 03.03.2022 (17:00 h)

Math Machine Learning seminar MPI MIS + UCLA

Marco Mondelli (Institute of Science and Technology Austria)
Gradient Descent for Deep Neural Networks: New Perspectives from Mean-field and NTK

Understanding the properties of neural networks trained via gradient descent is at the heart of the theory of deep learning. In this talk, I will discuss two approaches to studying the behavior of gradient descent methods. The first takes a mean-field view, relating the dynamics of stochastic gradient descent (SGD) to a certain Wasserstein gradient flow in probability space. I will show how this idea allows one to study the connectivity, convergence, and implicit bias of the solutions found by SGD. The second approach is based on an analysis of the Neural Tangent Kernel (NTK). I will present tight bounds on its smallest eigenvalue and show their implications for memorization and optimization in deep networks.
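As background for the two viewpoints mentioned above (a standard textbook-style sketch, not taken from the talk itself), the central objects can be written as follows. In the mean-field picture, a two-layer network with $N$ neurons,
\[
f(x;\theta) \;=\; \frac{1}{N}\sum_{i=1}^{N} \sigma_*(x;\theta_i),
\]
is described by the empirical distribution $\rho_t$ of its parameters; as $N\to\infty$, (stochastic) gradient descent corresponds to a Wasserstein gradient flow of the risk $R(\rho)$,
\[
\partial_t \rho_t \;=\; \nabla_\theta \cdot \bigl(\rho_t \,\nabla_\theta \Psi(\theta;\rho_t)\bigr),
\]
where $\Psi$ denotes the first variation of $R$ at $\rho_t$. In the second viewpoint, the Neural Tangent Kernel of a network $f(x;\theta)$ is
\[
\Theta(x,x') \;=\; \bigl\langle \nabla_\theta f(x;\theta),\, \nabla_\theta f(x';\theta) \bigr\rangle,
\]
and the smallest eigenvalue of the Gram matrix $\bigl(\Theta(x_i,x_j)\bigr)_{i,j}$ over the training data governs memorization capacity and the convergence rate of gradient descent.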

Based on joint work with Adel Javanmard, Vyacheslav Kungurtsev, Andrea Montanari, Guido Montufar, Quynh Nguyen, and Alexander Shevchenko.
