More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
- Wei Hu (University of California, Berkeley)
Abstract
Understanding how deep neural networks generalize remains notoriously challenging in theory. This talk will motivate and examine a simpler question: the generalization of high-dimensional linear regression models. As a testbed, it uses several linear settings induced by high-performing real-world neural networks (e.g., the empirical neural tangent kernel, or NTK, of a pretrained ResNet applied to CIFAR-100). We find that, perhaps surprisingly, even in these linear settings, most existing theoretical analyses of linear/kernel regression fail to qualitatively capture the empirical generalization phenomena. In contrast, an estimator derived from a random matrix theory hypothesis accurately predicts generalization. Based on arxiv.org/abs/2203.06176 (ICML 2022).
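The abstract does not spell out the estimator's form, so purely as an illustration of what a risk estimator for kernel ridge regression can look like, here is a minimal NumPy sketch of the classical generalized cross-validation (GCV) estimate, a well-known estimator of this flavor. The function name, the ridge parameterization, and the toy Gaussian data are assumptions for illustration, not code from the paper.

    import numpy as np

    def gcv_risk_estimate(K, y, lam):
        """GCV estimate of out-of-sample risk for kernel ridge regression.

        NOTE: illustrative sketch, not the paper's implementation.
        K:   (n, n) kernel matrix on the training set (e.g., an empirical NTK)
        y:   (n,) or (n, c) training labels
        lam: ridge regularization strength
        """
        n = K.shape[0]
        # Residual operator R = I - K (K + n*lam*I)^{-1} maps the labels
        # to the training residuals of the ridge solution.
        R = np.eye(n) - K @ np.linalg.inv(K + n * lam * np.eye(n))
        residuals = R @ y
        # GCV: training MSE inflated by the squared normalized trace of R,
        # which accounts for the degrees of freedom absorbed by the fit.
        return (np.sum(residuals ** 2) / n) / (np.trace(R) / n) ** 2

    # Hypothetical toy usage: random features standing in for a real kernel.
    rng = np.random.default_rng(0)
    n, d = 200, 500
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
    print(gcv_risk_estimate(X @ X.T, y, lam=1e-2))

The general idea behind estimators of this kind is to correct the in-sample residuals upward by a trace-based factor measuring how much of the label signal the ridge fit has memorized.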
Bio: Wei Hu is a postdoc at UC Berkeley and an incoming assistant professor at the University of Michigan. He obtained his Ph.D. from Princeton University, where he was advised by Sanjeev Arora, and his bachelor's degree from Tsinghua University. He is interested in the theoretical and scientific foundations of modern machine learning, with a particular focus on deep learning.