Relative flatness and generalization

Henning Petzka (RWTH Aachen University)

Live Stream

Abstract

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks. While empirical evidence consistently shows a strong correlation between flatness measures and generalization, the theoretical explanation for this connection remains an open problem. Our recent paper, "Relative Flatness and Generalization", proposes an explanation for this relationship by linking it to a model's robustness under feature perturbations and a measure of how representative the training data is of its underlying distribution. We establish a necessary condition for the connection between flatness and generalization to hold.

In this talk, I will provide an overview of our paper's results, focusing on the motivations, insights, and limitations of our approach.

Links

seminar

19.06.25 to 02.10.25

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream


Upcoming Events of this Seminar

Thursday, 19.06.25 Reimagining Gradient Descent: Large Stepsize, Oscillation, and Acceleration with Jingfeng Wu
Thursday, 03.07.25 On the Power of Context-Enhanced Learning in LLMs with Xingyu Zhu
Thursday, 10.07.25 The effect of low rank and stochasticity on Gradient Descent at the Edge of Stability with Avrajit Ghosh a.o.
Thursday, 14.08.25 to be announced with Jonathan Siegel
Thursday, 02.10.25 to be announced with Marcello Carioni