Neural Network Loss Landscapes in High Dimensions: Theory Meets Practice

Stanislav Fort (Stanford University)

Live Stream

Abstract

Deep neural networks trained with gradient descent have been extremely successful at learning solutions to difficult problems in domains such as vision, gameplay, and natural language, many of which had previously been considered to require intelligence. Despite this success, we still lack a detailed, predictive understanding of how these systems work. In this talk, I will focus on recent efforts to understand the structure of deep neural network loss landscapes and how gradient descent navigates them during training. In particular, I will discuss a phenomenological approach to modelling their large-scale structure [1], the role of their nonlinear nature in the early phases of training [2], and its effects on ensembling and calibration [3,4].

[1] Stanislav Fort, Stanislaw Jastrzebski. "Large Scale Structure of Neural Network Loss Landscapes." Advances in Neural Information Processing Systems 32 (NeurIPS 2019). arXiv 1906.04724

[2] Stanislav Fort et al. "Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel." NeurIPS 2020. arXiv 2010.15110

[3] Stanislav Fort, Huiyi Hu, Balaji Lakshminarayanan. "Deep Ensembles: A Loss Landscape Perspective." arXiv 1912.02757

[4] Marton Havasi et al. "Training independent subnetworks for robust prediction." ICLR 2021. arXiv 2010.06610


Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream

