Abstract for the talk on 18.09.2020 (17:00 h)Math Machine Learning seminar MPI MIS + UCLA
Mert Pilanci (Stanford University)
Exact Polynomial-time Convex Formulations for Training Multilayer Neural Networks: The Hidden Convex Optimization Landscape
In this talk, we introduce exact convex optimization formulations of multilayer neural network training problems with rectified linear units. Our theory leverages semi-infinite duality, gauge functions and minimum norm regularization. We show that two and three layer neural networks can be globally trained via a convex program with number of variables polynomial in the number of training samples and number of hidden neurons. To our knowledge, this is the first polynomial time algorithm that provably optimizes non-trivial neural network architectures in the literature. Our results provide an equivalent characterization of neural network models in terms of a hidden convex optimization landscape with convex regularization where the data matrix is partitioned over hyperplane arrangements. As a result, non-convex ReLU neural networks models can be precisely represented as convex regularization methods, where piecewise linear models are fitted via group L1 norm penalty. Moreover, we show that certain standard two and three layer convolutional neural networks can be globally optimized in fully polynomial time. In particular, our results reveal a hidden regularization mechanism behind convolutional nets and characterizes how the architecture and pooling strategies alter the regularizer. Finally, we present numerical simulations showing that standard local search heuristics such as stochastic gradient descent can be dramatically inefficient compared to the proposed convex program.