Abstract for the talk on 21.08.2020 (17:00 h), Math Machine Learning seminar MPI MIS + UCLA
Preetum Nakkiran (Harvard University)
Distributional Generalization: A New Kind of Generalization
See the video of this talk.
We initiate the study of a new set of empirical properties of interpolating classifiers, including neural networks, kernel machines, and decision trees. These properties constitute a new type of generalization, which extends the classical notion. This "Distributional Generalization" property roughly states that the test and train outputs of interpolating classifiers are close *as distributions*, not just in their error. This captures, for example, the following behavior: if we mislabel 30 percent of dogs as cats in the train set of CIFAR-10, then a ResNet trained to zero train error will in fact mislabel roughly 30 percent of dogs as cats on the test set as well, while leaving other classes unaffected. We generalize this observation via several formal conjectures, which we test across realistic domains. These properties form a more fine-grained understanding of interpolating classifiers, beyond just their test error, and lay out an alternative framework for reasoning about their generalization.
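The CIFAR-10 experiment described above starts by injecting structured label noise into the training set. A minimal sketch of just that corruption step (not the training itself) might look as follows; the class indices, noise rate, and synthetic labels here are illustrative assumptions, not details from the talk.

```python
import numpy as np

# Hypothetical sketch of the train-set corruption described in the abstract:
# relabel 30% of "dog" examples as "cat". The labels here are synthetic
# stand-ins for CIFAR-10 labels; class indices follow the usual CIFAR-10
# ordering (cat = 3, dog = 5), an assumption for illustration only.
DOG, CAT = 5, 3

def corrupt_dogs(labels, p=0.3, rng=None):
    """Return a copy of `labels` with a fraction p of DOG entries set to CAT."""
    rng = rng or np.random.default_rng(0)
    labels = labels.copy()
    dogs = np.flatnonzero(labels == DOG)
    flip = rng.choice(dogs, size=int(p * len(dogs)), replace=False)
    labels[flip] = CAT
    return labels

rng = np.random.default_rng(0)
train_labels = rng.integers(0, 10, size=50_000)  # synthetic 10-class labels
noisy_labels = corrupt_dogs(train_labels, p=0.3, rng=rng)

dogs = train_labels == DOG
frac_flipped = np.mean(noisy_labels[dogs] == CAT)
print(f"fraction of dogs relabeled as cats: {frac_flipped:.2f}")
```

Distributional Generalization then predicts that an interpolating classifier fit to `noisy_labels` would reproduce roughly this same dog-to-cat confusion rate on fresh test data, which is the quantity the talk's experiments measure.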
Joint work with Yamini Bansal.