Workshop

Global convergence of gradient descent for some non-convex learning problems

  • Francis Bach (INRIA)
E1 05 (Leibniz-Saal)

Abstract

Many tasks in machine learning and signal processing can be solved by minimizing a convex function of a measure. This includes sparse spikes deconvolution and training a neural network with a single hidden layer. For these problems, we study a simple minimization method: the unknown measure is discretized into a mixture of particles, and a continuous-time gradient descent is performed on their weights and positions. This is an idealization of the usual way to train neural networks with a large hidden layer. We show that, when initialized correctly and in the many-particle limit, this gradient flow, although non-convex, converges to global minimizers. (Joint work with Lénaïc Chizat.)
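As a rough illustration of the particle discretization described in the abstract (a minimal sketch, not the authors' implementation), the snippet below trains a single-hidden-layer ReLU network in the mean-field parameterization f(x) = (1/m) Σ_j w_j ReLU(θ_j · x) by plain gradient descent on both the weights w_j and the positions θ_j. The data, network sizes, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data from a small "teacher" network (purely illustrative).
n, d = 200, 5
X = rng.standard_normal((n, d))
theta_star = rng.standard_normal((3, d))
y = np.maximum(X @ theta_star.T, 0.0).sum(axis=1)

# Over-parameterized "student": m particles, each a (weight, position) pair.
m = 500
w = rng.standard_normal(m)            # particle weights w_j
theta = rng.standard_normal((m, d))   # particle positions theta_j
lr = 0.1 * m                          # step size scaled with m so each particle moves at an O(1) rate

for step in range(5001):
    pre = X @ theta.T                 # (n, m) pre-activations theta_j . x_i
    act = np.maximum(pre, 0.0)        # ReLU activations
    resid = act @ w / m - y           # prediction errors f(x_i) - y_i, shape (n,)
    # Gradients of the squared loss (1/2n) * sum_i (f(x_i) - y_i)^2
    grad_w = act.T @ resid / (n * m)                               # shape (m,)
    grad_theta = ((pre > 0) * resid[:, None] * w).T @ X / (n * m)  # shape (m, d)
    w -= lr * grad_w
    theta -= lr * grad_theta
    if step % 1000 == 0:
        print(f"step {step}: loss {0.5 * np.mean(resid ** 2):.4f}")
```

The 1/m normalization is what connects this parameterization to a measure: the network output is the integral of a single neuron against the empirical measure of the m particles, and in the many-particle limit the discrete updates on (w_j, θ_j) approximate a gradient flow on that measure, which is the regime addressed by the convergence result above.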

Valeria Hünniger

Max Planck Institute for Mathematics in the Sciences (contact via email)

Jörg Lehnert

Max Planck Institute for Mathematics in the Sciences (contact via email)

Jürgen Jost

Max Planck Institute for Mathematics in the Sciences

Felix Otto

Max Planck Institute for Mathematics in the Sciences

Bernd Sturmfels

Max Planck Institute for Mathematics in the Sciences