Distributional Generalization: A New Kind of Generalization

  • Preetum Nakkiran (Harvard University)
Live Stream

Abstract

We initiate the study of a new set of empirical properties of interpolating classifiers, including neural networks, kernel machines, and decision trees. These properties constitute a new type of generalization, which extends the classical notion. This "Distributional Generalization" property roughly states that the test and train outputs of interpolating classifiers are close *as distributions*, as opposed to close merely in their error. This captures, for example, the following behavior: if we mislabel 30 percent of dogs as cats in the train set of CIFAR-10, then a ResNet trained to zero train error will in fact mislabel roughly 30 percent of dogs as cats on the test set as well, while leaving other classes unaffected. We generalize this observation via several formal conjectures, which we test across realistic domains. These properties provide a more fine-grained understanding of interpolating classifiers, beyond their test error, and lay out an alternative framework for reasoning about their generalization.
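The dog/cat example above can be illustrated with a toy simulation. The sketch below (not code from the talk; the noise rate, class indices, and the toy sampling model are illustrative assumptions) injects 30% dog-to-cat label noise into synthetic "train" labels, then models a distributionally-generalizing interpolating classifier as one that fits the noisy train labels exactly and, on test points, outputs a sample from the train conditional distribution of outputs given the true class. The train and test dog-to-cat confusion rates then come out close, as the conjecture predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
CAT, DOG = 3, 5  # illustrative class indices (CIFAR-10: cat=3, dog=5)

# Synthetic "true" labels for 10 balanced classes.
y_train = rng.integers(0, 10, size=50_000)
y_test = rng.integers(0, 10, size=10_000)

# Inject label noise: relabel 30% of training dogs as cats.
noisy = y_train.copy()
dogs = np.where(y_train == DOG)[0]
flip = rng.choice(dogs, size=int(0.3 * len(dogs)), replace=False)
noisy[flip] = CAT

# Toy stand-in for an interpolating classifier: it reproduces the noisy
# train labels exactly, and on each test point it samples an output from
# the train conditional distribution p(output | true class) -- the
# behavior that Distributional Generalization posits.
def predict_test(y_true):
    preds = np.empty_like(y_true)
    for c in range(10):
        pool = noisy[y_train == c]  # train outputs for true class c
        idx = y_true == c
        preds[idx] = rng.choice(pool, size=idx.sum())
    return preds

test_preds = predict_test(y_test)
train_rate = np.mean(noisy[y_train == DOG] == CAT)
test_rate = np.mean(test_preds[y_test == DOG] == CAT)
print(f"train dog->cat rate: {train_rate:.3f}")
print(f"test  dog->cat rate: {test_rate:.3f}")
```

Both rates land near 0.30: the classifier's errors on unseen dogs mirror the distribution of errors it memorized on training dogs, rather than averaging out.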

Joint work with Yamini Bansal.


Math Machine Learning seminar MPI MIS + UCLA


Contact: Katharina Matschke (MPI for Mathematics in the Sciences)
