Conservation Laws for Gradient Flows

  • Rémi Gribonval (University Lyon)
Live Stream


Understanding the geometric properties of gradient descent dynamics is a key ingredient in deciphering the recent success of very large machine learning models. A striking observation is that trained over-parameterized models retain some properties of the optimization initialization. This “implicit bias” is believed to be responsible for some favorable properties of the trained models and could explain their good generalization properties. In this work, we present the definition and properties of “conservation laws”: quantities that are conserved during the gradient flow of a given machine learning model, such as a ReLU network, for any training data and any loss. After explaining how to find the maximal number of independent conservation laws via Lie algebra computations, we provide algorithms to compute a family of polynomial laws, as well as to compute the number of (not necessarily polynomial) conservation laws. We find that for a number of architectures there are no more laws than the already known ones, and we identify new laws for certain flows with momentum and/or non-Euclidean geometries.
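As a small illustration of what such a law looks like (a sketch that is not taken from the talk), consider a one-hidden-layer ReLU network f(x) = sum_j a_j relu(w_j · x) without biases. A classical conserved quantity for its gradient flow is ||w_j||^2 - a_j^2, one per hidden neuron j. The snippet below approximates the flow by gradient descent with a small step size, so the quantity is only conserved up to a drift of order lr^2 per step. The function name, the network sizes, the random data, and the squared loss are all assumptions chosen for the demonstration.

```python
import numpy as np

def relu_net_conservation_demo(steps=2000, lr=1e-3, seed=0):
    """Hypothetical demo: track ||w_j||^2 - a_j^2 along small-step gradient descent."""
    rng = np.random.default_rng(seed)
    n, d, h = 32, 3, 4                      # samples, input dim, hidden width (arbitrary)
    X = rng.normal(size=(n, d))             # synthetic inputs
    y = rng.normal(size=n)                  # synthetic targets
    W = 0.5 * rng.normal(size=(h, d))       # hidden-layer weights, one row per neuron
    a = 0.5 * rng.normal(size=h)            # output weights

    def conserved(W, a):
        # One candidate conserved quantity per hidden neuron.
        return np.sum(W ** 2, axis=1) - a ** 2

    def loss(W, a):
        pred = np.maximum(X @ W.T, 0.0) @ a
        return 0.5 * np.mean((pred - y) ** 2)

    c_start, loss_start = conserved(W, a), loss(W, a)
    for _ in range(steps):
        pre = X @ W.T                       # pre-activations, shape (n, h)
        act = np.maximum(pre, 0.0)          # ReLU activations
        resid = act @ a - y                 # residuals, shape (n,)
        grad_a = act.T @ resid / n          # dL/da, shape (h,)
        grad_W = ((resid[:, None] * (pre > 0)) * a).T @ X / n  # dL/dW, shape (h, d)
        # Simultaneous explicit Euler step of the gradient flow.
        W = W - lr * grad_W
        a = a - lr * grad_a
    return c_start, conserved(W, a), loss_start, loss(W, a)

if __name__ == "__main__":
    c0, c1, l0, l1 = relu_net_conservation_demo()
    print("loss before / after:", l0, l1)
    print("max drift of ||w_j||^2 - a_j^2:", np.max(np.abs(c1 - c0)))
```

The loss decreases while the per-neuron quantity stays nearly constant. The residual drift comes from the time discretization and shrinks as lr decreases, with the total training time steps * lr held fixed.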

Joint work with Sibylle Marcotte and Gabriel Peyré.

23.05.24 13.06.24

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream

Katharina Matschke

MPI for Mathematics in the Sciences Contact via Mail
