Abstract for the talk on 22.10.2020 (17:00 h), Math Machine Learning seminar MPI MIS + UCLA
Boris Hanin (Princeton University)
Neural Networks at Finite Width and Large Depth
Deep neural networks are often considered to be complicated "black boxes," for which a full systematic analysis is not only out of reach but also impossible. In this talk, which is based on ongoing joint work with Sho Yaida and Daniel Adam Roberts, I will make the opposite claim: namely, that deep neural networks with random weights and biases are perturbatively solvable models. Our approach applies to networks at finite width n and large depth L, the regime in which they are used in practice. A key point will be the emergence of a notion of "criticality," which involves a fine-tuning of model parameters (weight and bias variances). At criticality, neural networks are particularly well-behaved but still exhibit a tension between large values of n and L, with larger values of n tending to make neural networks more like Gaussian processes and larger values of L amplifying higher cumulants. Our analysis at initialization also has many consequences for networks during and after training, which I will discuss if time permits.
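The width/depth tension mentioned in the abstract can be probed numerically. The following minimal sketch (not taken from the talk; the function names, layer sizes, and trial counts are illustrative assumptions) initializes random ReLU networks at a standard ReLU-critical point (weight variance 2/fan_in, zero biases) and estimates the excess kurtosis of a single output preactivation over many random draws of the weights. In the infinite-width limit this statistic would vanish, reflecting Gaussian-process behavior, while at finite width it grows with the depth-to-width ratio L/n, reflecting the amplification of higher cumulants with depth.

```python
# Illustrative sketch only: finite-width, large-depth effects in random ReLU MLPs.
import numpy as np

def preactivation_sample(n, L, rng):
    """One forward pass of a depth-L, width-n random ReLU MLP on a fixed unit-norm input."""
    x = np.ones(n) / np.sqrt(n)
    for _ in range(L):
        # ReLU-critical initialization: weight variance 2/n, bias variance 0 (assumed here).
        W = rng.normal(0.0, np.sqrt(2.0 / n), size=(n, n))
        z = W @ x            # preactivation of the current layer
        x = np.maximum(z, 0.0)
    return z[0]              # track one component of the final preactivation

def excess_kurtosis(n, L, trials=4000, seed=0):
    """Excess kurtosis of the preactivation over random initializations (0 for a Gaussian)."""
    rng = np.random.default_rng(seed)
    zs = np.array([preactivation_sample(n, L, rng) for _ in range(trials)])
    m2, m4 = np.mean(zs**2), np.mean(zs**4)
    return m4 / m2**2 - 3.0

if __name__ == "__main__":
    # Larger n pushes the estimate toward 0 (Gaussian process); larger L pushes it up.
    for n, L in [(50, 5), (50, 20), (200, 20)]:
        print(f"n={n:4d} L={L:3d}  excess kurtosis ~ {excess_kurtosis(n, L):.3f}")
```

With a modest number of trials the estimates are noisy, but the qualitative trend, growth of the non-Gaussianity with L at fixed n and its suppression as n increases, should be visible.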