Abstract for the talk on 16.12.2021 (17:00 h)

Math Machine Learning seminar MPI MIS + UCLA

Zhi-Qin John Xu (Shanghai Jiao Tong University)
Occam’s razors in neural networks: frequency principle in training and embedding principle of loss landscape
16.12.2021, 17:00 h, only video broadcast

I will demonstrate that a neural network (NN) learns the training data as simply as it can, resembling an implicit Occam's razor, from the following two viewpoints. First, the NN output often follows a frequency principle, i.e., it learns the data from low to high frequency. Second, we prove an embedding principle: the loss landscape of an NN "contains" all the critical points of all narrower NNs. The embedding principle provides a basis for the condensation phenomenon, i.e., the NN weights condense on isolated directions when initialized small, so that the effective NN size is much smaller than its actual size, yielding a simple representation of the training data.


Zhi-Qin John Xu is an associate professor at Shanghai Jiao Tong University (SJTU). Zhi-Qin obtained a B.S. in Physics (2012) and a Ph.D. in Mathematics (2016) from SJTU. Before joining SJTU, Zhi-Qin worked as a postdoc at NYUAD and the Courant Institute from 2016 to 2019.

If you want to participate in this video broadcast, please register using this special form. The (Zoom) link for the video broadcast will be sent to your email address one day before the seminar.

18.11.2021, 11:41