The Description Length of Deep Learning Models

  • Léonard Blier (Facebook AI Research, Université Paris Saclay, Inria)

Abstract

Solomonoff's general theory of inference and the Minimum Description Length (MDL) principle formalize Occam's razor: a good model of the data is one that losslessly compresses the data, including the cost of describing the model itself. This viewpoint illuminates many known results in statistics and machine learning. The success of deep learning, however, seems to contradict it: deep neural networks are often the best models in practice, yet they are extremely complex in the sense that they are hard to compress. We resolve this paradox and demonstrate experimentally that deep neural networks can compress the training data even when the cost of encoding their parameters is taken into account.
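
For orientation, the compression criterion the abstract alludes to is the classical two-part code of MDL; the talk itself may use refined variants (e.g. variational or prequential codes), which are not spelled out here:

L(D) = L(\theta) + L(D \mid \theta), with L(D \mid \theta) = -\log_2 p_\theta(D),

where L(\theta) is the number of bits needed to describe the model parameters \theta, and L(D \mid \theta) is the number of bits needed to encode the data D given the model's predictions. The apparent paradox is that for deep networks the parameter term looks enormous, yet the abstract claims the total codelength can still be small.

The following minimal Python sketch (an illustration under the assumptions above, not the speaker's experimental setup; all numbers are hypothetical) shows how such a two-part codelength could be tallied in bits:

import math

def data_bits(probs_of_true_labels):
    # Bits to encode the labels given the model:
    # sum of -log2 p(true label) over the dataset (Shannon codelength).
    return sum(-math.log2(p) for p in probs_of_true_labels)

def param_bits(num_params, bits_per_param=16):
    # Naive cost of describing the model itself:
    # fixed-precision encoding of every parameter.
    return num_params * bits_per_param

# Hypothetical values for illustration only.
probs = [0.9, 0.8, 0.95, 0.7]  # model's probability of each true label
total = param_bits(num_params=1000) + data_bits(probs)
print(f"Two-part codelength: {total:.1f} bits")

A model compresses the data in the MDL sense when this total is smaller than the cost of encoding the labels naively, i.e. log2 of the number of classes per example summed over the dataset.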


Seminar series: Math Machine Learning seminar MPI MIS + UCLA
Venue: MPI for Mathematics in the Sciences (live stream)
Contact: Katharina Matschke, MPI for Mathematics in the Sciences (via email)
