Abstract for the talk on 21.05.2020 (17:00 h)
Math Machine Learning seminar MPI MIS + UCLA
Dennis Elbrächter (Universität Wien)
How degenerate is the parametrization of (ReLU) neural networks?
See the video of this talk.
Neural network training is usually accomplished by solving a non-convex optimization problem using stochastic gradient descent. Although one optimizes over the network's parameters, the loss function (up to regularization terms) generally depends only on the realization of the neural network, i.e., the function it computes. We discuss how studying the optimization problem over the space of realizations may open up new ways to understand neural network training, provided one manages to overcome the difficulties caused by the redundancies and degeneracies in how neural networks are parametrized.
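
As an illustration of the redundancy mentioned above, the following minimal sketch (not taken from the talk) builds a one-hidden-layer ReLU network with scalar input and output and shows two well-known ways in which different parameters yield the same realization: positive rescaling of a hidden neuron and permutation of hidden neurons. All names (`realization`, `squared_loss`, `theta`) and all numerical values are hypothetical and chosen only for illustration.

```python
def relu(x):
    return max(0.0, x)


def realization(params, x):
    """Evaluate a one-hidden-layer ReLU network with scalar input and output."""
    w1, b1, w2, b2 = params
    hidden = [relu(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2


def squared_loss(params, data):
    """Mean squared error; it sees the parameters only through the realization."""
    return sum((realization(params, x) - y) ** 2 for x, y in data) / len(data)


# Hypothetical parameters: (input weights, hidden biases, output weights, output bias).
theta = ([1.0, -2.0], [0.5, 1.0], [3.0, -1.0], 0.25)

# Rescaling: multiply the first neuron's input weight and bias by c > 0
# and divide its output weight by c. Since relu(c * z) = c * relu(z) for c > 0,
# the computed function is unchanged.
c = 4.0
theta_scaled = ([c * 1.0, -2.0], [c * 0.5, 1.0], [3.0 / c, -1.0], 0.25)

# Permutation: swap the two hidden neurons.
theta_permuted = ([-2.0, 1.0], [1.0, 0.5], [-1.0, 3.0], 0.25)

# Hypothetical training data as (x, y) pairs.
data = [(-1.0, 0.0), (0.0, 1.0), (0.5, 2.0), (2.0, -1.0)]

for name, params in [("original", theta), ("scaled", theta_scaled), ("permuted", theta_permuted)]:
    print(name, squared_loss(params, data))
```

The three parameter vectors are distinct points in parameter space, yet they define the same function and therefore the same loss value on any data set. This is one simple instance of the kind of degeneracy the abstract refers to; it does not cover all of them.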