On the importance of higher-order input statistics for learning in neural networks
- Sebastian Goldt (SISSA Trieste)
Abstract
Neural networks are powerful feature extractors, but which features do they extract from their data? And how does the structure of the training data shape the representations they learn? We discuss two recent works that address these questions. First, we develop a simple model for images and show that a neural network trained on these images can learn a convolution from scratch [1]. This pattern-formation process is driven by the combination of the translation invariance of the "images" and the non-Gaussian, higher-order statistics of the inputs. Second, we conjecture a "distributional simplicity bias" whereby neural networks learn increasingly complex distributions of their inputs during training. We present analytical and experimental evidence for this conjecture, going from a simple perceptron up to deep ResNets [2].
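To make the two ingredients named above concrete, here is a minimal sketch of one way to generate one-dimensional "images" that are both translation invariant and non-Gaussian. It is an illustration under stated assumptions, not necessarily the model used in [1]: the function names (sample_inputs, excess_kurtosis), the Gaussian-shaped circulant covariance, and the choice of tanh as the saturating nonlinearity are all hypothetical. The parameter gain interpolates between nearly Gaussian inputs (small gain) and strongly non-Gaussian inputs (large gain).

```python
import numpy as np


def sample_inputs(n_samples, dim, corr_length, gain, rng):
    """Draw 1D 'images' on a ring: translation invariant and, for large gain, non-Gaussian.

    All names and modelling choices here are assumptions made for illustration.
    """
    # Circulant covariance: the entry C[i, j] depends only on the circular distance |i - j|.
    idx = np.arange(dim)
    dist = np.minimum(idx, dim - idx)
    first_row = np.exp(-((dist / corr_length) ** 2))

    # The eigenvalues of a circulant matrix are the FFT of its first row.
    # Clip tiny negative values so the square root below is well defined.
    eigvals = np.clip(np.fft.fft(first_row).real, 0.0, None)

    # Colour white Gaussian noise in Fourier space to get correlated Gaussian fields z.
    white = rng.standard_normal((n_samples, dim))
    z = np.fft.ifft(np.fft.fft(white, axis=1) * np.sqrt(eigvals), axis=1).real

    # A pointwise saturating nonlinearity keeps translation invariance
    # but introduces higher-order (non-Gaussian) statistics.
    x = np.tanh(gain * z)

    # Rescale to unit variance so different gains are comparable.
    return x / x.std()


def excess_kurtosis(x):
    """Fourth-moment diagnostic over all pixels: close to 0 for Gaussian data."""
    x = x - x.mean()
    return float(np.mean(x**4) / np.mean(x**2) ** 2 - 3.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for gain in (0.01, 1.0, 10.0):
        x = sample_inputs(n_samples=20000, dim=32, corr_length=2.0, gain=gain, rng=rng)
        print(f"gain={gain}: excess kurtosis = {excess_kurtosis(x):.3f}")
```

Because the nonlinearity acts on each pixel separately, the covariance of the resulting inputs still depends only on the distance between pixels, while the excess kurtosis moves away from zero as the gain grows. Comparing a network trained on small-gain inputs with one trained on large-gain inputs is one way to probe the role of higher-order statistics in isolation.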
References:
[1] Ingrosso & Goldt, PNAS 119 (40), e2201854119, arXiv:2202.00565
[2] Refinetti, Ingrosso & Goldt, in preparation.