Abstract for the talk on 20.08.2020 (17:00 h), Math Machine Learning seminar MPI MiS + UCLA
Arthur Jacot (École Polytechnique Fédérale de Lausanne)
Neural Tangent Kernel: Convergence and Generalization of DNNs
See the video of this talk.
Modern deep learning has popularized the use of very large neural networks, but the theoretical tools to study such networks are still lacking. The Neural Tangent Kernel (NTK) describes how the output neurons evolve during training, allowing a precise description of the convergence and generalization of DNNs. In the infinite-width limit (when the number of hidden neurons grows to infinity), the NTK converges to a deterministic and fixed limit, ensuring convergence to a global minimum. In particular, training a DNN with the MSE loss corresponds to doing Kernel Ridge Regression (KRR) with the NTK. Under the assumption of i.i.d. input samples, the risk of KRR can be approximated using the Signal Capture Threshold (SCT), which identifies which principal components of the signal are learned. A further approximation leads to the Kernel Alignment Risk Estimator (KARE), which predicts the test error of KRR from the training data only.
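The KRR predictor mentioned above has a simple closed form: given a kernel matrix K on the training set and a ridge parameter, the prediction at a new point is a weighted combination of kernel evaluations against the training inputs. The sketch below illustrates this, using a generic RBF kernel as a stand-in; in the setting of the talk, the kernel would instead be the architecture-dependent NTK (the function and parameter names here are illustrative, not from the talk).

```python
import numpy as np

def kernel(A, B, gamma=100.0):
    # Placeholder RBF kernel; in the NTK setting this would be replaced
    # by the closed-form Neural Tangent Kernel of the architecture.
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def krr_fit_predict(X_train, y_train, X_test, ridge=1e-3):
    # KRR predictor: f(x) = K(x, X) (K(X, X) + ridge * I)^{-1} y
    K = kernel(X_train, X_train)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X_train)), y_train)
    return kernel(X_test, X_train) @ alpha
```

As the ridge parameter goes to zero, this predictor interpolates the training data, which mirrors a fully trained infinite-width network under MSE loss.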