Workshop

On the Expected Complexity of Maxout Networks

  • Guido Montufar (UCLA / MPI MIS, Los Angeles / Leipzig, USA)
E1 05 (Leibniz-Saal)

Abstract

Learning with neural networks depends on the complexity of the representable functions but, more importantly, on how typical parameters are assigned to functions of different complexity. Taking the number of activation regions as a complexity measure, recent works have shown that the practical complexity of deep ReLU networks is often far from the theoretical maximum. In this work we show that this phenomenon also occurs in networks with maxout (multi-argument) activation functions, and when considering the decision boundaries in classification tasks. We also show that the parameter space has a multitude of full-dimensional regions with widely different complexity, and we obtain nontrivial lower bounds on the expected complexity. Finally, we investigate different parameter initialization procedures and show that they can increase the speed of convergence in training. The linear regions of the functions represented by a maxout network correspond to the faces of polytopes obtained by building convex hulls and Minkowski sums of polytopes. Hence these results can be interpreted as statements about the expected number of faces of certain compositions of random polytopes.
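
The polytope correspondence in the last two sentences can be made concrete as follows. This is a hedged sketch of the standard background picture, not a statement taken verbatim from the talk; the unit f and the notation P_f are illustrative.

```latex
% Hedged background sketch (standard lifted-Newton-polytope facts, not a
% result specific to the paper): a maxout unit and its associated polytope.
\[
  f(x) \;=\; \max_{1 \le k \le K} \bigl( \langle w_k, x \rangle + b_k \bigr),
  \qquad
  P_f \;=\; \operatorname{conv}\bigl\{ (w_k, b_k) \;:\; 1 \le k \le K \bigr\}.
\]
% The linear regions of f correspond to upper faces of P_f, and for a sum of
% two such convex piecewise linear functions the relevant polytope is the
% Minkowski sum:
\[
  P_{f+g} \;=\; P_f + P_g .
\]
```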

This is joint work with Hanna Tseran (arxiv.org/abs/2107.00379).
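
For concreteness, here is a minimal NumPy sketch, not code from the paper, of a single maxout layer together with a crude grid-based count of the distinct activation patterns it realizes (a proxy for its number of linear regions); the layer sizes, function names, and grid are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, width, rank = 2, 3, 3          # 2D inputs, 3 maxout units of rank 3 (hypothetical sizes)
W = rng.standard_normal((width, rank, d_in))
b = rng.standard_normal((width, rank))

def maxout_layer(x):
    # Each unit outputs the maximum of its `rank` affine pre-activations.
    pre = np.einsum('ukd,d->uk', W, x) + b
    return pre.max(axis=1)

def activation_pattern(x):
    # Which affine piece attains the max in each unit labels the linear
    # region containing x (ties are measure zero for generic parameters).
    pre = np.einsum('ukd,d->uk', W, x) + b
    return tuple(pre.argmax(axis=1))

print("f(0) =", maxout_layer(np.zeros(d_in)))

# Count the distinct patterns met by a grid of inputs; a single layer can
# realize at most rank**width patterns.
xs = np.linspace(-3.0, 3.0, 200)
patterns = {activation_pattern(np.array([x, y])) for x in xs for y in xs}
print(f"activation patterns seen on the grid: {len(patterns)} "
      f"(at most {rank**width} for this layer)")
```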

Contact

Saskia Gutzschebauch
Max Planck Institute for Mathematics in the Sciences

Paul Breiding
Max Planck Institute for Mathematics in the Sciences