Talk

Regularization by Misclassification in ReLU Neural Networks

  • Ido Nachum (École Polytechnique Fédérale de Lausanne)
Live Stream

Abstract

We highlight a property of feedforward neural networks trained with SGD for classification tasks: an SGD update for a misclassified data point decreases the Frobenius norm of the weight matrix of each layer. This holds for networks of any architecture with activations such as ReLU and Leaky ReLU, and it may explain why in practice the performance of neural networks can be similar with or without L2 regularization. We then use this insight to study cases in which a ReLU network is presented with random or noisy labels. On the one hand, training on completely random labels eliminates neurons and can nullify the network entirely. On the other hand, with some label noise, we observe cases with considerable improvement in performance; that is, some label noise can decrease the test error of a network in an artificial example. Also, in an experiment on the MNIST dataset, we show that training with noisy labels performs comparably to standard training and yields a sparse over-complete basis for MNIST.
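The norm-shrinking property lends itself to a quick numerical check. The sketch below is not from the talk; the network size, cross-entropy loss, and learning rate are illustrative assumptions. It builds a small fully connected ReLU network in PyTorch, constructs a point the network currently misclassifies, performs one plain SGD step on that point, and prints how the Frobenius norm of each layer's weight matrix changes.

```python
# Illustrative sketch (not from the talk): one SGD step on a misclassified
# point, and its effect on each layer's Frobenius norm. The architecture,
# cross-entropy loss, and learning rate are assumptions for demonstration.
import torch
import torch.nn as nn

torch.manual_seed(0)

net = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 2),
)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Construct a misclassified example: label it with the class the
# network does NOT currently predict.
x = torch.randn(1, 20)
pred = net(x).argmax(dim=1).item()
y = torch.tensor([1 - pred])

layers = [m for m in net if isinstance(m, nn.Linear)]
norms_before = [m.weight.norm().item() for m in layers]  # Frobenius norms

# One plain SGD update on the misclassified point.
opt.zero_grad()
loss_fn(net(x), y).backward()
opt.step()

for i, (m, before) in enumerate(zip(layers, norms_before)):
    after = m.weight.norm().item()
    print(f"layer {i}: ||W||_F {before:.4f} -> {after:.4f}")
```

Whether the norms shrink in this exact toy configuration depends on how closely it matches the talk's setting (for example, the loss and the handling of biases), so the printout is an empirical check rather than a proof of the stated property.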

Links

seminar

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences, Live Stream

Upcoming Events of this Seminar
22.05.25
05.06.25