Geometry-induced Implicit Regularization in Deep ReLU Neural Networks

  • Joachim Bona-Pellissier (MaLGa, Università di Genova)
Live Stream


Neural networks with a large number of parameters do not overfit, owing to implicit regularization that favors good networks. This calls into question the adequacy of the parameter count as a measure of complexity. To better understand implicit regularization, we study the geometry of the output set, that is, the set of outputs obtained on a fixed batch of inputs as the parameters vary. The local dimension of this set, called the batch functional dimension, is not constant: it is almost surely determined by the activation patterns in the hidden layers, and it is invariant to the symmetries of the network parameterization. Empirical evidence shows that the batch functional dimension decreases during optimization, whether it is computed on training inputs or on test inputs, so that optimization leads to parameters with low batch functional dimension. We call this phenomenon geometry-induced implicit regularization. Because the batch functional dimension depends on both the network parameters and the inputs, we also investigate the maximum batch functional dimension achievable for fixed parameters as the inputs vary. We show that this quantity, termed the computable full functional dimension, is likewise invariant to the symmetries of the network parameterization and is determined by the achievable activation patterns. In addition, we present a sampling theorem that establishes rapid convergence of the estimate of the computable full functional dimension. Empirical findings suggest that it is close to the number of parameters, which is related to the local identifiability of the parameters.
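As an illustration only, and not the speaker's code, the batch functional dimension can be sketched numerically as the rank of the Jacobian of the stacked network outputs with respect to the parameters. That reading of "local dimension of the output set" is an assumption made here. The one-hidden-layer architecture, the function names batch_jacobian and batch_functional_dimension, and the random inputs are likewise hypothetical choices.

```python
import numpy as np

def batch_jacobian(W1, b1, W2, b2, X):
    """Jacobian of the stacked outputs on the batch X w.r.t. all parameters.

    Illustrative network: f(x) = W2 @ relu(W1 @ x + b1) + b2.
    Rows: one per (input, output coordinate).
    Columns: W1, b1, W2, b2, each flattened, in that order.
    """
    n_hidden, _ = W1.shape
    d_out = W2.shape[0]
    rows = []
    for x in X:
        pre = W1 @ x + b1
        act = (pre > 0).astype(float)  # hidden-layer activation pattern
        h = act * pre                  # ReLU output
        for k in range(d_out):
            dW1 = np.outer(W2[k] * act, x)     # d y_k / d W1
            db1 = W2[k] * act                  # d y_k / d b1
            dW2 = np.zeros((d_out, n_hidden))
            dW2[k] = h                         # d y_k / d W2
            db2 = np.zeros(d_out)
            db2[k] = 1.0                       # d y_k / d b2
            rows.append(np.concatenate([dW1.ravel(), db1, dW2.ravel(), db2]))
    return np.array(rows)

def batch_functional_dimension(W1, b1, W2, b2, X):
    """Numerical rank of the batch Jacobian (assumed definition, see lead-in)."""
    return int(np.linalg.matrix_rank(batch_jacobian(W1, b1, W2, b2, X)))

# Hypothetical example: 2 inputs, 3 hidden neurons, 1 output, 50 random points.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)
X = rng.normal(size=(50, 2))

n_params = W1.size + b1.size + W2.size + b2.size
dim = batch_functional_dimension(W1, b1, W2, b2, X)
print(dim, n_params)
```

In this sketch, each hidden neuron's positive rescaling symmetry contributes one direction along which the outputs do not change, so the rank can be at most the number of parameters minus the number of hidden neurons. Switching neurons off, which changes the activation pattern, lowers the rank further.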


4/25/24 5/16/24

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream

Katharina Matschke

MPI for Mathematics in the Sciences
