Talk

Exploring Deep Neural Collapse via Extended and Penalized Unconstrained Features Models

  • Tom Tirer (Bar-Ilan University)

Abstract

Training deep neural networks for classification often involves minimizing the training loss beyond the point of zero training error. In this phase of training, a "neural collapse" (NC) behavior has been empirically observed: the variability of the features (outputs of the penultimate layer) of within-class samples decreases, and the mean features of different classes approach a certain tight frame structure. Recent papers have shown that minimizers with this structure emerge when optimizing a simplified "unconstrained features model" (UFM) with a regularized cross-entropy loss. In this talk, I will start with a review of the empirical findings on the NC phenomenon. Then, I will present some of our theoretical results (from our ICML 2022 and ICML 2023 publications). Namely, we show that the minimizers of a UFM also exhibit a (somewhat more delicate) NC structure with the regularized MSE loss, and we analyze the gradient flow in this case, identifying the different dynamics of the features' within- and between-class covariance matrices. Then, we extend the UFM in two different ways to provide mathematical reasoning for the depthwise empirical NC behavior and for the effect of regularization hyperparameters on the closeness to collapse. We support our theory with experiments in practical deep learning settings.
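To make the collapse properties mentioned above concrete, the following is a minimal NumPy sketch (not taken from the talk; the function name nc_metrics and its interface are hypothetical) of the two standard NC diagnostics for a set of penultimate-layer features: within-class variability measured against between-class spread (NC1), and the distance of the normalized class means from a simplex equiangular tight frame (NC2, the structure reported for cross-entropy; the MSE-loss case discussed in the talk yields a related, more delicate structure).

```python
import numpy as np

def nc_metrics(features, labels):
    """Hypothetical helper (illustrative only): closeness-to-collapse diagnostics.

    features: (N, d) array of penultimate-layer outputs
    labels:   (N,) array of integer class labels
    """
    classes = np.unique(labels)
    K, d = len(classes), features.shape[1]
    global_mean = features.mean(axis=0)

    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    centered_means = class_means - global_mean                      # (K, d)

    # Within-class covariance: variability of samples around their class means.
    Sigma_W = np.zeros((d, d))
    for i, c in enumerate(classes):
        diffs = features[labels == c] - class_means[i]
        Sigma_W += diffs.T @ diffs / len(features)

    # Between-class covariance: spread of the class means around the global mean.
    Sigma_B = centered_means.T @ centered_means / K

    # NC1: within-class variability relative to between-class spread;
    # it shrinks toward zero during the terminal phase of training.
    nc1 = np.trace(Sigma_W @ np.linalg.pinv(Sigma_B)) / K

    # NC2: distance of the normalized class-mean Gram matrix from the Gram
    # matrix of a simplex equiangular tight frame.
    M = centered_means / np.linalg.norm(centered_means, axis=1, keepdims=True)
    etf_gram = (np.eye(K) - np.ones((K, K)) / K) * K / (K - 1)
    nc2 = np.linalg.norm(M @ M.T - etf_gram) / np.linalg.norm(etf_gram)

    return nc1, nc2
```

Tracking these two quantities over training epochs is one common way to quantify how close a trained network is to the collapsed configuration.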


Math Machine Learning seminar MPI MIS + UCLA

Venue: MPI for Mathematics in the Sciences (Live Stream)

Contact: Katharina Matschke, MPI for Mathematics in the Sciences
