Relative flatness and generalization
- Henning Petzka (RWTH Aachen University)
Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks. While empirical evidence consistently shows a strong correlation between flatness measures and generalization, the theoretical explanation for this connection remains an open problem. Our recent paper, "Relative Flatness and Generalization", proposes an explanation for this relationship in terms of a model's robustness under feature perturbations and a measure of how representative the training data is of the underlying distribution. We establish a necessary condition for the connection between flatness and generalization to hold.
In this talk, I will provide an overview of our paper's results, focusing on the motivations, insights, and limitations of our approach.