On the Natural Gradient for Deep Learning

  • Nihat Ay (Max Planck Institute for Mathematics in the Sciences)
E1 05 (Leibniz-Saal)


The natural gradient method is one of the most prominent information-geometric methods within the field of machine learning.

It was proposed by Amari in 1998 and uses the Fisher-Rao metric as the Riemannian metric with respect to which the gradient is defined in optimisation tasks. Since then, it has proved to be extremely efficient in the context of neural networks, reinforcement learning, and robotics. In recent years, attempts have been made to apply the natural gradient method to the training of deep neural networks. However, due to the huge number of parameters of such networks, the method is currently not directly applicable in this context: the Fisher information matrix that has to be inverted has one row and one column per parameter. In my presentation, I outline ways to simplify the natural gradient for deep learning. The corresponding simplifications are related to the locality of learning associated with the underlying network structure.
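As a rough illustration of the basic update (not of the simplifications discussed in the talk), the natural gradient step can be written as theta <- theta - lr * F(theta)^(-1) * grad L(theta), where F is the Fisher information matrix. The sketch below assumes the smallest possible model, a single Bernoulli variable parameterised by its logit, so that F reduces to the scalar p(1 - p). The function names, the learning rate, and the target mean 0.8 are hypothetical choices made for this example only.

```python
import math


def sigmoid(t):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def natural_gradient_step(theta, x_bar, lr):
    """One natural gradient step for a Bernoulli model with logit theta.

    x_bar is the empirical mean of the observed 0/1 data (an assumed input).
    """
    p = sigmoid(theta)
    grad = p - x_bar        # ordinary gradient of the average negative log-likelihood
    fisher = p * (1.0 - p)  # Fisher information of the Bernoulli model in the logit
    return theta - lr * grad / fisher  # precondition the gradient with F^(-1)


def fit(x_bar, theta0=0.0, lr=0.5, steps=50):
    """Iterate the natural gradient step from theta0."""
    theta = theta0
    for _ in range(steps):
        theta = natural_gradient_step(theta, x_bar, lr)
    return theta


theta_hat = fit(0.8)
print(sigmoid(theta_hat))  # should be close to the target mean 0.8
```

In this one-parameter case the division by `fisher` is trivial. For a network with n parameters, F is an n x n matrix, which is why the plain method does not scale to deep networks without some structural simplification.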

Valeria Hünniger

Max-Planck-Institut für Mathematik in den Naturwissenschaften

Guido Montúfar

Max Planck Institute for Mathematics in the Sciences