On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias

Itay Safran (Purdue University)

Live Stream

Abstract

We will study the dynamics of gradient flow when training a depth-2 ReLU network on univariate data in a binary classification setting. Assuming our data is labelled by a teacher network of width r, we will show a convergence guarantee that holds if the learned student network is sufficiently over-parameterized. Moreover, we will show that the implicit bias in our setting is such that the student network converges to a network with at most O(r) linear regions, which results in an intuitive generalization bound. Overall, this provides an end-to-end learnability result in this setting, demonstrating the interplay between optimization and generalization: over-parameterization facilitates optimization, while the implicit bias prevents overfitting.
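
As a rough illustration of what counting linear regions means for a univariate depth-2 ReLU network, the sketch below counts the pieces of f(x) = sum_j v[j] * max(0, w[j]*x + b[j]) by keeping only the kinks where the slope really changes. The function name `count_linear_regions`, the parameter layout (`w`, `b`, `v`), and the rounding used to merge coinciding kinks are assumptions made for this example; it is not the construction or the definition used in the talk.

```python
from collections import defaultdict


def count_linear_regions(w, b, v, tol=1e-12):
    """Count the linear pieces of f(x) = sum_j v[j] * max(0, w[j]*x + b[j]).

    Illustrative sketch only: w, b, v are plain lists of floats holding the
    hidden weights, hidden biases, and output weights of the network.
    """
    # Neuron j has a kink at x = -b[j] / w[j]. Crossing it from left to right
    # changes the slope of f by v[j] * |w[j]|, whatever the sign of w[j].
    slope_jump = defaultdict(float)
    for wj, bj, vj in zip(w, b, v):
        if wj == 0:
            continue  # a neuron with zero input weight is constant: no kink
        kink = round(-bj / wj, 9)  # assumption: merge kinks equal up to 1e-9
        slope_jump[kink] += vj * abs(wj)

    # Only kinks with a non-zero total slope change separate two regions.
    breakpoints = [x for x, jump in slope_jump.items() if abs(jump) > tol]
    return len(breakpoints) + 1


# Five neurons: the two kinks at x = 0 cancel each other and the two kinks
# at x = 1 merge into one, so only the breakpoints x = -2 and x = 1 remain.
regions = count_linear_regions(
    w=[1.0, 1.0, 2.0, -1.0, 1.0],
    b=[0.0, 0.0, -2.0, 1.0, 2.0],
    v=[1.0, -1.0, 0.5, 3.0, 1.0],
)
```

The example shows why the number of linear regions can be much smaller than the width: neurons whose kinks coincide or cancel add no new region, so a heavily over-parameterized network can still represent a function with few pieces.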

seminar

03.07.25 02.10.25

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream

Upcoming Events of this Seminar

Thursday, 03.07.25: On the Power of Context-Enhanced Learning in LLMs, with Xingyu Zhu
Thursday, 10.07.25: The effect of low rank and stochasticity on Gradient Descent at the Edge of Stability, with Avrajit Ghosh and others
Thursday, 14.08.25: To be announced, with Jonathan Siegel
Thursday, 02.10.25: To be announced, with Marcello Carioni