
The Large Learning Rate Phase of Wide, Deep Neural Networks

  • Yasaman Bahri (Google Brain)

Abstract

Recent investigations into infinitely wide deep neural networks have revealed intriguing connections between deep networks, kernel methods, and Gaussian processes. Nonetheless, important dynamical regimes of finite-width neural networks lie far outside the realm of applicability of these results. I will discuss how the choice of learning rate in gradient descent naturally classifies the dynamics of deep networks into two classes, a “lazy” regime and a “catapult” regime, separated by a sharp phase transition as networks become wider. I will describe the distinct phenomenological signatures of the two phases, how they are elucidated in a class of simple, solvable models, and their implications for model performance.
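
The lazy/catapult distinction can be illustrated in a tiny model. Below is a minimal sketch (my own illustration, not code from the talk), assuming the solvable model resembles the two-layer linear network f = (v · u) x / √m of Lewkowycz et al. (2020), trained on a single example with squared loss. At small learning rates the loss decreases monotonically (lazy); above roughly 2/λ₀, where λ₀ is the tangent-kernel value at initialization, the loss first blows up before converging (catapult). The width, seed, and learning rates below are arbitrary illustrative choices.

```python
# Sketch of the lazy vs. catapult regimes in a two-layer linear model
# f = (v . u) x / sqrt(m), one training example, MSE loss.
# Width, seed, and learning rates are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
m = 2048                        # width; large so the kernel concentrates
x, y = 1.0, 0.0                 # single example with zero target

def run(lr, steps=100):
    u = rng.standard_normal(m)            # first-layer weights
    v = rng.standard_normal(m)            # second-layer weights
    losses = []
    for _ in range(steps):
        f = (v @ u) * x / np.sqrt(m)      # network output
        losses.append(0.5 * (f - y) ** 2)
        g = f - y                         # dL/df
        # simultaneous gradient-descent update of both layers
        u, v = (u - lr * g * v * x / np.sqrt(m),
                v - lr * g * u * x / np.sqrt(m))
    return np.array(losses)

# At init lambda_0 = (|u|^2 + |v|^2)/m ~ 2, so the lazy/catapult
# boundary for this model sits near lr = 2/lambda_0 ~ 1.
for lr in (0.25, 1.5):
    L = run(lr)
    print(f"lr={lr}: initial={L[0]:.3f}  max={L.max():.3f}  final={L[-1]:.2e}")
```

With these choices the small learning rate gives a monotonically decreasing loss, while lr = 1.5 (so η λ₀ ≈ 3) overshoots before converging; in this model, learning rates beyond roughly 4/λ₀ diverge.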


Math Machine Learning seminar MPI MIS + UCLA

Venue: MPI for Mathematics in the Sciences (live stream)

Contact: Katharina Matschke, MPI for Mathematics in the Sciences
