Abstract for the talk on 15.04.2021 (17:00 h)

Math Machine Learning seminar MPI MiS + UCLA

Stanislav Fort (Stanford University)
Neural Network Loss Landscapes in High Dimensions: Theory Meets Practice
15.04.2021, 17:00 h, video broadcast only

Deep neural networks trained with gradient descent have been extremely successful at learning solutions to a broad suite of difficult problems in domains such as vision, gameplay, and natural language, many of which had previously been considered to require intelligence. Despite their tremendous success, we still do not have a detailed, predictive understanding of how these systems work. In this talk, I will focus on recent efforts to understand the structure of deep neural network loss landscapes and how gradient descent navigates them during training. In particular, I will discuss a phenomenological approach to modelling their large-scale structure [1], the role of the networks' nonlinear nature in the early phases of training [2], and the consequences for ensembling and calibration [3,4].
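To make the notion of loss-landscape structure concrete, here is a minimal sketch of one standard probe used in this line of work: linearly interpolating in weight space between two independently trained networks and evaluating the loss along the path. This is not code from the talk; it assumes PyTorch, and the toy model, synthetic data, seeds, and hyperparameters are illustrative choices.

import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 10)                      # synthetic inputs
y = (X.sum(dim=1) > 0).long()                 # synthetic binary labels

def make_and_train_net(seed):
    """Train a small MLP from a seed-dependent initialization."""
    torch.manual_seed(seed)
    net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(200):                      # plain full-batch gradient descent
        opt.zero_grad()
        loss = loss_fn(net(X), y)
        loss.backward()
        opt.step()
    return net

net_a = make_and_train_net(seed=1)            # two independently found solutions
net_b = make_and_train_net(seed=2)

loss_fn = nn.CrossEntropyLoss()
probe = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
with torch.no_grad():
    for alpha in [0.0, 0.25, 0.5, 0.75, 1.0]:
        # set the probe's weights to a point on the straight line between
        # the two trained solutions, then measure the loss there
        for p, pa, pb in zip(probe.parameters(),
                             net_a.parameters(), net_b.parameters()):
            p.copy_((1 - alpha) * pa + alpha * pb)
        print(f"alpha={alpha:.2f}  loss={loss_fn(probe(X), y).item():.4f}")

Along such a line the loss is typically low at both endpoints but rises to a barrier in between, one signature of the multi-mode large-scale structure that motivates the ensembling perspective in [1] and [3].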

[1] Stanislav Fort and Stanislaw Jastrzebski. "Large Scale Structure of Neural Network Loss Landscapes." Advances in Neural Information Processing Systems 32 (NeurIPS 2019). arXiv:1906.04724

[2] Stanislav Fort et al. "Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel." NeurIPS 2020. arXiv:2010.15110

[3] Stanislav Fort, Huiyi Hu, and Balaji Lakshminarayanan. "Deep Ensembles: A Loss Landscape Perspective." arXiv:1912.02757

[4] Marton Havasi et al. "Training independent subnetworks for robust prediction." ICLR 2021. arXiv:2010.06610

If you want to participate in this video broadcast, please register using this special form. The Zoom link for the video broadcast will be sent to your email address one day before the seminar.
