Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape
We study the loss landscape of two-layer, mildly overparameterized ReLU neural networks on a generic finite input dataset under the squared error loss. Our approach bounds the dimension of the sets of local and global minima using the rank of the Jacobian of the parametrization map. Using results on random binary matrices, we show that most activation patterns correspond to parameter regions with no bad differentiable local minima. Furthermore, for one-dimensional input data, we show that most activation regions realizable by the network contain a high-dimensional set of global minima and no bad local minima. We confirm these results experimentally, observing a phase transition, as the amount of overparameterization varies, from most regions having full rank to many regions having deficient rank.
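To make the central object concrete, here is a minimal numpy sketch (not the authors' code; the bias-free network form, dimensions, and random data are illustrative assumptions). Within a fixed activation region, the output of a two-layer ReLU network f(x) = Σ_j v_j · relu(w_j · x) is smooth in the parameters, and the rank of the Jacobian of the parametrization map θ ↦ (f(x_1), …, f(x_n)) controls the dimension of its fibers, which is how the abstract's dimension bounds arise.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 8, 2, 6                  # dataset size, input dim, hidden width
X = rng.standard_normal((n, d))    # a generic finite input dataset

def jacobian(W, v, X):
    """Exact Jacobian of theta -> (f(x_1), ..., f(x_n)) for
    f(x) = sum_j v[j] * relu(W[j] @ x), evaluated at theta = (W, v)."""
    A = (X @ W.T > 0).astype(float)  # activation pattern: n x m binary matrix
    blocks = []
    for j in range(W.shape[0]):
        # d columns: df(x_i)/dW[j] = v[j] * 1[unit j active at x_i] * x_i
        blocks.append(v[j] * A[:, [j]] * X)
        # 1 column:  df(x_i)/dv[j] = relu(W[j] @ x_i)
        blocks.append((A[:, j] * (X @ W[j]))[:, None])
    return np.hstack(blocks)         # shape n x m(d+1)

W = rng.standard_normal((m, d))
v = rng.standard_normal(m)
J = jacobian(W, v, X)
# Rank n (full row rank) is the favorable case: the region then contains
# no bad differentiable local minima in the sense of the abstract.
print(np.linalg.matrix_rank(J))
```

Since the activation pattern A is constant on each activation region, the rank depends only on the pattern and the data, which is where the results on random binary matrices enter.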
This is work with Kedar Karhadkar, Michael Murray, Hanna Tseran.
Bio: Guido Montúfar is an Associate Professor of Mathematics and Statistics & Data Science at UCLA, as well as the Leader of the Math Machine Learning Group at the Max Planck Institute for Mathematics in the Sciences. His research focuses on deep learning theory and, more generally, mathematical aspects of machine learning. He studied mathematics and physics at TU Berlin, obtained the Dr. rer. nat. in 2012 as an IMPRS fellow in Leipzig, and held postdoc positions at Penn State and MPI MiS. He is a 2022 Alfred P. Sloan Research Fellow.