On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
- Itay Safran (Purdue University)
Abstract
We will study the dynamics of gradient flow when training a depth-2 ReLU network on univariate data in a binary classification setting. Assuming our data is labelled by a teacher network of width r, we will show a convergence guarantee that holds whenever the learned student network is sufficiently over-parameterized. Moreover, we will show that the implicit bias in our setting drives the student network to converge to a network with at most O(r) linear regions, which yields an intuitive generalization bound. Overall, this provides an end-to-end learnability result in this setting, demonstrating the interplay between optimization and generalization: over-parameterization facilitates optimization, while the implicit bias prevents overfitting.
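To make the central quantity concrete, the following is a minimal sketch (not taken from the paper) of how the linear regions of a depth-2 univariate ReLU network x ↦ Σᵢ vᵢ·ReLU(wᵢx + bᵢ) can be counted: the function is continuous and piecewise linear, its slope can only change at the breakpoints −bᵢ/wᵢ, and two adjacent intervals with equal slope form a single region. The function name, parameter names, and random initialization below are illustrative assumptions.

```python
import numpy as np

def count_linear_regions(w, b, v, tol=1e-12):
    """Count the linear regions of x -> sum_i v_i * relu(w_i * x + b_i).

    Candidate breakpoints are -b_i / w_i for neurons with w_i != 0, so the
    network has at most (#distinct breakpoints) + 1 regions. We count the
    *effective* regions by probing the slope on each interval between
    consecutive breakpoints: since the network is continuous, adjacent
    intervals with equal slope belong to the same affine piece.
    """
    active = np.abs(w) > tol                       # neurons that contribute a kink
    kinks = np.unique(-b[active] / w[active])      # sorted distinct breakpoints
    if kinks.size == 0:
        return 1                                   # globally affine
    # One probe point inside every open interval delimited by the breakpoints.
    probes = np.concatenate(([kinks[0] - 1.0],
                             (kinks[:-1] + kinks[1:]) / 2,
                             [kinks[-1] + 1.0]))
    # Slope at x is the sum of v_i * w_i over neurons active at x.
    slopes = np.array([np.sum(v * w * ((w * x + b) > 0)) for x in probes])
    # Each slope change marks the start of a new linear region.
    return 1 + int(np.sum(np.abs(np.diff(slopes)) > tol))

# Example: a width-5 network has at most 6 linear regions, often fewer.
rng = np.random.default_rng(0)
w, b, v = rng.normal(size=5), rng.normal(size=5), rng.normal(size=5)
print(count_linear_regions(w, b, v))
```

In this sense, a bound of O(r) on the number of linear regions of the learned student caps the complexity of the learned function at the scale of the width-r teacher, which is what makes the resulting generalization bound intuitive.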