Exact sparse reconstruction and stability for infinitely wide shallow neural networks
- Marcello Carioni (University of Twente)
Abstract
In this talk, I will present recent results on the sparsity properties of infinitely wide shallow neural networks. It is well known that the training of shallow neural networks can be convexified by lifting the weight space to the space of measures on a sphere. This yields a convex optimization problem over measures that, depending on the chosen activation function, exhibits different degrees of smoothness and sparsity. While representer theorems have been established, showing the existence of at least one sparse solution, much less is known about exact reconstruction and the sparse stability properties of minimizers. After reviewing the results available in the literature, I will focus on the ReLU activation function and demonstrate that, in this case, minimizers are always sparse. Specifically, every minimizer is a finite linear combination of Dirac deltas whose support is determined by the behavior of the dual certificate. Next, I will discuss the stability of sparsity for minimizers with respect to perturbations of the labels and modifications of the regularization parameter. I will show that, for small perturbations and under suitable assumptions on the input, the perturbed minimizer remains sparse, and that the locations of its Dirac deltas can be estimated in terms of the perturbation level. These results are obtained through a careful analysis of the dual certificate of the minimizer, distinguishing Dirac deltas that lie on the decision boundary from those lying in the interior of a decision region determined by the inputs.
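For orientation, a minimal sketch (in LaTeX) of the lifted convex problem described above; the data $(x_i, y_i)$, the loss $\ell$, and the regularization parameter $\lambda$ are illustrative assumptions, and the precise setting of the talk may differ:

    \min_{\mu \in \mathcal{M}(\mathbb{S}^{d})} \; \sum_{i=1}^{n} \ell\big(f_\mu(x_i),\, y_i\big) + \lambda\, \|\mu\|_{\mathrm{TV}},
    \qquad
    f_\mu(x) = \int_{\mathbb{S}^{d}} \sigma\big(\langle \omega, (x, 1) \rangle\big)\, \mathrm{d}\mu(\omega),

where $\sigma$ is the activation function ($\sigma(t) = \max(t, 0)$ in the ReLU case) and $\|\mu\|_{\mathrm{TV}}$ denotes the total variation norm. In this form, a representer theorem guarantees at least one minimizer of the type $\mu^* = \sum_{k=1}^{N} c_k\, \delta_{\omega_k}$, with $N$ bounded by the number of training samples.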
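The dual certificate can be sketched in the same hedged notation: for a smooth loss, first-order optimality at a minimizer $\mu^* = \sum_k c_k\, \delta_{\omega_k}$ produces a continuous function on the sphere,

    \eta(\omega) = -\frac{1}{\lambda} \sum_{i=1}^{n} \partial_1 \ell\big(f_{\mu^*}(x_i),\, y_i\big)\, \sigma\big(\langle \omega, (x_i, 1) \rangle\big),
    \qquad
    \|\eta\|_{\infty} \le 1, \quad \eta(\omega_k) = \operatorname{sign}(c_k),

where $\partial_1 \ell$ is the derivative of the loss in its first argument. The Dirac deltas of $\mu^*$ can therefore only sit where $|\eta|$ attains the value $1$, and the exact reconstruction and stability statements track how this set of maximizers behaves under perturbations of the labels and of $\lambda$.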