Expected Complexity and Gradients of Maxout Networks

Hanna Tseran (MPI MiS, Leipzig)

E1 05 (Leibniz-Saal)

Abstract

Learning with neural networks relies on the complexity of the representable functions but, more importantly, on the particular assignment of typical parameters to functions of different complexity. Taking the number of activation regions as a complexity measure, we show that the practical complexity of networks with maxout activation functions, which correspond to tropical rational maps, is often far from the theoretical maximum. Continuing the analysis of the expected behavior, we study the expected gradients of a maxout network with respect to inputs and parameters and obtain bounds for the moments that depend on the architecture and the parameter distribution. Based on this, we formulate parameter initialization strategies that avoid vanishing and exploding gradients in wide networks.
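
As a rough illustration of the complexity measure mentioned in the abstract, the following sketch evaluates a small maxout network and counts the distinct activation patterns met along a line segment in input space. It is not taken from the talk: the function names, the layer shapes, and the Gaussian weights scaled by 1/sqrt(fan-in) are assumptions made for this example, and the scaling is a placeholder rather than the initialization proposed by the speaker. Because the segment is only sampled, the count is a lower bound on the number of regions it crosses.

```python
import numpy as np

def maxout_layer(x, W, b):
    """Maxout layer: unit i outputs max_k (W[i, k] . x + b[i, k]).

    W has shape (units, K, n_in) and b has shape (units, K).
    Returns the outputs and, per unit, the index of the affine piece
    that attains the maximum (the unit's activation pattern).
    """
    pre = W @ x + b                      # shape (units, K)
    return pre.max(axis=1), pre.argmax(axis=1)

def count_regions_on_segment(layers, x0, x1, samples=2000):
    """Count distinct activation patterns on sampled points of x0 -> x1."""
    patterns = set()
    for t in np.linspace(0.0, 1.0, samples):
        x = (1 - t) * x0 + t * x1
        pattern = []
        for W, b in layers:
            x, arg = maxout_layer(x, W, b)
            pattern.extend(arg.tolist())
        patterns.add(tuple(pattern))
    return len(patterns)

# Hypothetical example network: 2 inputs, two layers of 5 units, rank K = 3.
rng = np.random.default_rng(0)
n_in, units, K = 2, 5, 3
layers = []
for fan_in in (n_in, units):
    # Assumed placeholder scaling, not the strategy from the talk.
    W = rng.normal(size=(units, K, fan_in)) / np.sqrt(fan_in)
    b = rng.normal(size=(units, K))
    layers.append((W, b))

n_regions = count_regions_on_segment(
    layers, np.array([-5.0, -5.0]), np.array([5.0, 5.0])
)
print("activation patterns found on the segment:", n_regions)
```

Repeating the count over many random parameter draws gives an empirical estimate of the expected number of regions along the segment, which can then be compared with the theoretical maximum for the same architecture.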

Seminar on Nonlinear Algebra, 07.08.25 to 30.10.25

MPI for Mathematics in the Sciences G3 10 (Lecture hall)


Upcoming Events of this Seminar

Thursday, 07.08.25: Equivariant noetherianity: theory and applications, with Teresa Yu
Thursday, 07.08.25: Enumerative and tropical aspects of Brill-Noether loci, with Nathan Pflueger
Tuesday, 19.08.25: Pick’s formula and Castelnuovo polytopes, with Takayuki Hibi
Tuesday, 19.08.25: tba, with Benjamin Hollering
Thursday, 04.09.25: tba, with Mateusz Michalek
Thursday, 18.09.25: tba, with Niharika Paul
Thursday, 02.10.25: The variety of orthogonal frames, with Alessio Sammartano
Thursday, 30.10.25: Open linear maps and Gibbs varieties, with Stephan Weis