Abstract for the talk on 01.12.2022 (17:00 h)Math Machine Learning seminar MPI MIS + UCLA
Levon Nurbekyan (UCLA)
Efficient natural gradient method for large-scale optimization problems
See the video of this talk.
Large-scale optimization is at the forefront of modern data science, scientific computing, and applied mathematics with areas of interest, including high-dimensional PDE, inverse problems, machine learning, etc. First-order methods are workhorses for large-scale optimization due to modest computational cost and simplicity of implementation. Nevertheless, these methods are often agnostic to the structural properties of the problem under consideration and suffer from slow convergence, being trapped in bad local minima, etc. Natural gradient descent is an acceleration technique in optimization that takes advantage of the problem’s geometric structure and preconditions the objective function’s gradient by a suitable "natural" metric. Hence parameter update directions correspond to the steepest descent on a corresponding "natural" manifold instead of the Euclidean parameter space rendering a parametrization invariant descent direction on that manifold. Despite its success in statistical inference and machine learning, the natural gradient descent method is far from a mainstream computational technique due to the computational complexity of calculating and inverting the preconditioning matrix. This work aims at a unified computational framework and streamlining the computation of a general natural gradient flow via the systematic application of efficient tools from numerical linear algebra. We obtain efficient and robust numerical methods for natural gradient flows without directly calculating, storing, or inverting the dense preconditioning matrix. We treat Euclidean, Wasserstein, Sobolev, and Fisher–Rao natural gradients in a single framework for a general loss function.