Abstract for the talk on 04.09.2020 (17:00 h)
Math Machine Learning seminar MPI MIS + UCLA
Noam Razin (Tel Aviv University)
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
04.09.2020, 17:00 h, only video broadcast
Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks). It is an open question whether norms can explain the implicit regularization in matrix factorization. In this talk I will resolve this open question in the negative, showing that there exist natural matrix factorization problems on which the implicit regularization drives all norms (and quasi-norms) towards infinity. Our results suggest that, rather than perceiving the implicit regularization via norms, a potentially more useful interpretation is minimization of rank. I will present empirical evidence that this interpretation extends to a certain class of non-linear neural networks, suggesting that it may be key to explaining generalization in deep learning.
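For readers unfamiliar with the test-bed, the following is a minimal sketch of matrix completion via a linear neural network: a product of square matrices is trained by gradient descent to fit only the observed entries of a target. It is an illustration under assumptions, not the construction or the experiments from the talk. The function name `factorized_completion`, the square shapes, the near-zero initialization, and all hyperparameter values are hypothetical choices made for this example.

```python
import numpy as np


def _chain(mats, d):
    """Multiply a (possibly empty) list of d x d matrices from left to right."""
    out = np.eye(d)
    for m in mats:
        out = out @ m
    return out


def factorized_completion(observed, mask, depth=2, lr=0.01, steps=5000,
                          init_scale=1e-2, seed=0):
    """Fit the observed entries with a product W_1 @ ... @ W_depth.

    observed: d x d array (values at unobserved positions are ignored).
    mask:     d x d array, 1 where an entry is observed and 0 elsewhere.
    Returns the final product matrix and the list of loss values.
    All defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    d = observed.shape[0]
    # Small random initialization (an assumption of this sketch).
    factors = [init_scale * rng.standard_normal((d, d)) for _ in range(depth)]
    losses = []
    for _ in range(steps):
        product = _chain(factors, d)
        # The squared loss is measured on observed entries only.
        residual = mask * (product - observed)
        losses.append(0.5 * float(np.sum(residual ** 2)))
        # Chain rule: dL/dW_j = (W_1...W_{j-1})^T @ residual @ (W_{j+1}...W_depth)^T
        grads = [
            _chain(factors[:j], d).T @ residual @ _chain(factors[j + 1:], d).T
            for j in range(depth)
        ]
        factors = [w - lr * g for w, g in zip(factors, grads)]
    return _chain(factors, d), losses


# Hypothetical toy problem: a rank-1 target with one hidden entry.
truth = np.outer([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
mask = np.ones_like(truth)
mask[0, 0] = 0.0  # the entry at position (0, 0) is not observed
completed, losses = factorized_completion(truth * mask, mask)
```

The value that gradient descent assigns to the hidden entry is not constrained by the loss, so it reflects the implicit regularization of the optimizer. Inspecting `np.linalg.svd(completed, compute_uv=False)` alongside a norm of `completed` is one way to compare the rank-based and norm-based views mentioned in the abstract.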