The separation capacity of random neural networks

Sjoerd Dirksen (Utrecht University)

Live Stream

Abstract

Neural networks with random weights appear in a variety of machine learning applications, most prominently as the initialization of many deep learning algorithms and as a computationally cheap alternative to fully learned neural networks. The first goal of this talk is to enhance the theoretical understanding of random neural networks by addressing the following data separation problem: under which conditions can a random neural network make two classes (with positive distance) linearly separable? I will show that a sufficiently large two-layer ReLU network with Gaussian weights and uniformly distributed biases can solve this problem with high probability. Building on the insights behind this result, I will next present a simple randomized algorithm to produce a small interpolating neural net for a given dataset with two classes. In both results, the size of the network is explicitly linked to geometric properties of the two classes and their mutual arrangement. This instance-specific viewpoint allows one to overcome the curse of dimensionality and to prove memorization results beyond worst-case lower bounds.
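
The data separation problem can be illustrated with a small numerical experiment. The sketch below is not the construction or the analysis from the talk. It only assumes a toy dataset (a disk surrounded by a ring), a hypothetical width of 500 neurons, a hypothetical bias range of [-2, 2], and a least-squares fit as a convenient stand-in for checking linear separability. The function names are made up for this illustration.

```python
import numpy as np

def random_relu_features(X, width, bias_range, rng):
    """Map the rows of X through one random ReLU layer: relu(X W^T + b)."""
    d = X.shape[1]
    W = rng.standard_normal((width, d))                   # Gaussian weights
    b = rng.uniform(-bias_range, bias_range, size=width)  # uniform biases
    return np.maximum(X @ W.T + b, 0.0)

def make_rings(n_per_class, rng):
    """Toy data (an assumption): class -1 in a small disk, class +1 on a ring around it."""
    angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    radii = np.concatenate([
        rng.uniform(0.0, 0.5, size=n_per_class),   # inner class
        rng.uniform(1.5, 2.0, size=n_per_class),   # outer class
    ])
    X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
    y = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])
    return X, y

def linear_fit_accuracy(F, y):
    """Fit a linear classifier with intercept by least squares; return training accuracy."""
    A = np.column_stack([F, np.ones(len(F))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean(np.sign(A @ coef) == y))

rng = np.random.default_rng(0)
X, y = make_rings(50, rng)
features = random_relu_features(X, width=500, bias_range=2.0, rng=rng)

acc_raw = linear_fit_accuracy(X, y)            # linear classifier on the raw inputs
acc_random = linear_fit_accuracy(features, y)  # linear classifier after the random layer
print("raw inputs:", acc_raw, "random ReLU features:", acc_random)
```

In the raw plane, no line can split the disk from the ring, so the first accuracy stays below 1. After the random layer, the linear fit should classify the training points far better. How wide the layer must be, and how that width depends on the geometry of the two classes, is the question the talk addresses.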

Links

seminar

07.08.25 09.10.25

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream

Upcoming Events of this Seminar

Thursday, 07.08.25 Efficient compression of neural networks and datasets with Lukas Barth
Thursday, 14.08.25 to be announced with Jonathan Siegel
Thursday, 21.08.25 to be announced with Zhou Fan
Thursday, 28.08.25 to be announced with Randall Balestriero
Thursday, 02.10.25 to be announced with Marcello Carioni
Thursday, 09.10.25 to be announced with Baharan Mirzasoleiman