On the Natural Gradient for Deep Learning


The natural gradient method is one of the most prominent information-geometric methods within the field of machine learning.

It was proposed by Amari in 1998 and uses the Fisher-Rao metric as the Riemannian metric that defines the gradient in optimisation tasks. Since then, it has proved to be extremely efficient in the context of neural networks, reinforcement learning, and robotics. In recent years, attempts have been made to apply the natural gradient method to the training of deep neural networks. However, due to the huge number of parameters of such networks, the method is currently not directly applicable in this context. In my presentation, I outline ways to simplify the natural gradient for deep learning. These simplifications are related to the locality of learning associated with the underlying network structure.
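
To make the definition concrete: in coordinates, the natural gradient is the ordinary gradient multiplied by the inverse of the Fisher information matrix. The following is a minimal sketch of that idea on a toy problem, fitting a one-dimensional Gaussian by maximum likelihood. The choice of model, the function names natural_gradient_step and fit, and the step size are assumptions made for illustration only; the sketch is not taken from the talk and does not show the simplifications for deep networks that the talk discusses.

```python
# Illustrative sketch only: natural-gradient fit of a 1-D Gaussian N(mu, sigma^2).
# The model and all names here are assumptions, not taken from the talk.

def natural_gradient_step(mu, sigma, data, lr):
    """One natural-gradient step on the average negative log-likelihood."""
    n = len(data)
    mean_x = sum(data) / n
    mean_sq_dev = sum((x - mu) ** 2 for x in data) / n

    # Ordinary (Euclidean) gradient of the average negative log-likelihood.
    grad_mu = -(mean_x - mu) / sigma ** 2
    grad_sigma = 1.0 / sigma - mean_sq_dev / sigma ** 3

    # The Fisher information in (mu, sigma) coordinates is
    # diag(1 / sigma^2, 2 / sigma^2), so applying its inverse
    # reduces to an elementwise rescaling of the gradient.
    nat_mu = grad_mu * sigma ** 2
    nat_sigma = grad_sigma * sigma ** 2 / 2.0

    return mu - lr * nat_mu, sigma - lr * nat_sigma


def fit(data, steps=50, lr=0.5, mu=0.0, sigma=1.0):
    """Repeat natural-gradient steps from the starting point (mu, sigma)."""
    for _ in range(steps):
        mu, sigma = natural_gradient_step(mu, sigma, data, lr)
    return mu, sigma


if __name__ == "__main__":
    mu, sigma = fit([1.0, 2.0, 3.0, 4.0, 5.0])
    print(mu, sigma)
```

In this toy case the Fisher matrix is 2 x 2 and diagonal, so inverting it is trivial, and the iterates approach the sample mean and the maximum-likelihood standard deviation. For a model with n parameters the matrix has n^2 entries, which illustrates why the method is not directly applicable to networks with a huge number of parameters.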

Antje Vandenberg

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Nihat Ay

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Mikhail Prokopenko

University of Sydney, Australia