Information Geometry and its Applications III

Abstract Jan Peters



Jan Peters  (Max Planck Institute for Biological Cybernetics, Germany)
Friday, August 06, 2010, room Hörsaal 2
Information geometry in reinforcement learning: from natural policy gradients to relative entropy policy search

Policy search is a successful approach to reinforcement learning which has yielded many interesting applications in a variety of areas.

Unlike traditional value function-based learning methods, policy search approaches can be guaranteed to converge at least to a local optimum, can handle partially observed variables, and allow straightforward integration of domain knowledge. Research on policy search started with the classical work on parametric policy gradient methods. These methods turned out to be surprisingly slow and were marred by poor trade-offs between the exploration and exploitation parameters of the policy.

Results from information theory for supervised and unsupervised learning have triggered research into natural policy gradient methods. These turned out to be significantly more robust and efficient than the previous vanilla policy gradient approaches. An interesting interpretation of these results is that a policy improvement can fix the amount of information lost while maximizing the reward. This loss of information is measured by the relative entropy between the experienced state-action distribution and the new one generated by the improved policy. We continue this line of reasoning and propose the Relative Entropy Policy Search (REPS) method. The resulting method differs significantly from previous policy gradient approaches and yields an exact update step.
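The idea of maximizing reward under a fixed relative entropy budget can be illustrated with a minimal sketch. It assumes a simplified single-state (bandit-like) setting that is not part of the abstract: a batch of sampled returns is reweighted in proportion to exp(return / eta), and the temperature eta is chosen by bisection so that the relative entropy between the new weights and the uniform sample weights equals a bound epsilon. The function names, the bisection range and the setting itself are illustrative assumptions, not the method as presented in the talk, which works with state-action distributions.

```python
import math


def kl_to_uniform(weights):
    """Relative entropy between normalized weights and uniform sample weights."""
    n = len(weights)
    return sum(w * math.log(n * w) for w in weights if w > 0.0)


def tilted_weights(returns, eta):
    """Weights proportional to exp(return / eta), shifted by the max for stability."""
    top = max(returns)
    unnorm = [math.exp((r - top) / eta) for r in returns]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def reps_style_weights(returns, epsilon, iters=200):
    """Hypothetical helper: bisect on log(eta) until the KL budget epsilon is met.

    The relative entropy shrinks as eta grows, so a KL above epsilon means
    the temperature is too low and the lower bracket is raised.
    """
    lo, hi = math.log(1e-6), math.log(1e6)  # assumed search range for eta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl_to_uniform(tilted_weights(returns, math.exp(mid))) > epsilon:
            lo = mid  # update too greedy: raise the temperature
        else:
            hi = mid  # update within budget: try a lower temperature
    eta = math.exp(hi)
    return tilted_weights(returns, eta), eta


if __name__ == "__main__":
    weights, eta = reps_style_weights([1.0, 2.0, 3.0, 4.0], epsilon=0.1)
    print(weights, eta)
```

In this sketch a small epsilon keeps the weights close to uniform, while a larger epsilon concentrates them on the highest returns. The weights could then be used, for example, in a weighted maximum likelihood fit of new policy parameters, which is again an assumption about usage rather than something stated in the abstract.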

We show applications of these methods in the improvement of behaviors for robots and discuss their wider implications.

Date and Location

August 02 - 06, 2010
University of Leipzig
Augustusplatz
04103 Leipzig
Germany

Scientific Organizers

Nihat Ay
Max Planck Institute for Mathematics in the Sciences
Information Theory of Cognitive Systems Group
Germany

Paolo Gibilisco
Università degli Studi di Roma "Tor Vergata"
Facoltà di Economia
Italy

František Matúš
Academy of Sciences of the Czech Republic
Institute of Information Theory and Automation
Czech Republic

Scientific Committee

Shun-ichi Amari
RIKEN
Brain Science Institute, Mathematical Neuroscience Laboratory
Japan

Imre Csiszár
Hungarian Academy of Sciences
Alfréd Rényi Institute of Mathematics
Hungary

Dénes Petz
Budapest University of Technology and Economics
Department for Mathematical Analysis
Hungary

Giovanni Pistone
Collegio Carlo Alberto, Moncalieri
Italy

Administrative Contact

Antje Vandenberg
Max Planck Institute for Mathematics in the Sciences
Phone: (++49)-(0)341-9959-552
Fax: (++49)-(0)341-9959-555

05.04.2017, 12:42