Talk

Geometry-induced Implicit Regularization in Deep ReLU Neural Networks

  • Joachim Bona-Pellissier (MaLGa, Università di Genova)

Abstract

Neural networks with a large number of parameters do not overfit, owing to an implicit regularization that favors good networks. This challenges the adequacy of the parameter count as a measure of complexity. To understand this implicit regularization, we study the geometry of the network's output set. Its local dimension, called the batch functional dimension, varies with the parameters and the input batch. The batch functional dimension is almost surely determined by the activation patterns in the hidden layers and is invariant under the parameterization symmetries of the network. Empirically, the batch functional dimension decreases during optimization, whether computed on training inputs or on test inputs, so that optimization converges to parameters with low batch functional dimension. We call this phenomenon geometry-induced implicit regularization. Since the batch functional dimension depends on both the network parameters and the inputs, we also investigate, for fixed parameters, the maximum batch functional dimension achievable as the inputs vary. We show that this quantity, termed the computable full functional dimension, is likewise invariant under parameterization symmetries and is determined by the achievable activation patterns. In addition, we present a sampling theorem highlighting the rapid convergence of estimates of the computable full functional dimension. Empirically, it is found to be close to the number of parameters, which relates to the local identifiability of the parameters.
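
For readers who want to experiment with the quantity described above, the sketch below estimates a batch functional dimension numerically, under the assumption, consistent with the abstract, that it equals the rank of the Jacobian of the network's outputs on a fixed input batch with respect to the parameters. The network architecture, layer sizes, and random batch are hypothetical choices made purely for illustration and are not taken from the talk.

import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def mlp(params, x):
    # Fully connected ReLU network; params is a list of (W, b) pairs.
    for W, b in params[:-1]:
        x = jax.nn.relu(W @ x + b)
    W, b = params[-1]
    return W @ x + b

def batch_functional_dimension(params, batch):
    # Rank of the Jacobian of the concatenated batch outputs with respect to all parameters.
    flat, unravel = ravel_pytree(params)

    def batch_outputs(p_flat):
        p = unravel(p_flat)
        return jnp.concatenate([mlp(p, x) for x in batch])

    J = jax.jacobian(batch_outputs)(flat)  # shape: (batch_size * out_dim, n_params)
    return int(jnp.linalg.matrix_rank(J)), flat.size

# Tiny illustrative example: a 3 -> 8 -> 8 -> 2 ReLU network on 20 random inputs.
key = jax.random.PRNGKey(0)
sizes = [3, 8, 8, 2]
params = []
for i in range(len(sizes) - 1):
    key, k_w, k_b = jax.random.split(key, 3)
    params.append((jax.random.normal(k_w, (sizes[i + 1], sizes[i])),
                   jax.random.normal(k_b, (sizes[i + 1],))))
key, k_x = jax.random.split(key)
batch = jax.random.normal(k_x, (20, 3))

dim, n_params = batch_functional_dimension(params, batch)
print(f"batch functional dimension: {dim} / {n_params} parameters")

Since the rank is bounded by both the number of parameters and the total number of output coordinates in the batch, the value returned depends on the inputs as well as the parameters, which is why the abstract distinguishes the batch functional dimension from the computable full functional dimension obtained by maximizing over the inputs.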

Links

  • Seminar: Math Machine Learning seminar MPI MIS + UCLA (5/2/24, 5/16/24)
  • Live Stream: MPI for Mathematics in the Sciences
  • Contact: Katharina Matschke (MPI for Mathematics in the Sciences), via mail
