

Information Geometry and its Applications III
Abstract Jan Peters
Jan Peters (Max Planck Institute for Biological Cybernetics, Germany)
Friday, August 06, 2010, room Hörsaal 2
Information geometry in reinforcement learning: from natural policy gradients to relative entropy policy search
Policy search is a successful approach to reinforcement learning which has yielded many interesting applications in a variety of areas.
Unlike traditional value function-based learning methods, policy search approaches can be guaranteed to converge at least to a local optimum, can handle partially observed variables, and allow straightforward integration of domain knowledge. Research on policy search started with the classical work on parametric policy gradient methods. These turned out to be surprisingly slow and were marred by poor trade-offs between the exploration and exploitation parameters of the policy.
Results from information theory for supervised and unsupervised learning have triggered research into natural policy gradient methods. These turned out to be significantly more robust and efficient than the previous vanilla policy gradient approaches. An interesting interpretation of these results is that each policy improvement maximizes the reward while keeping the loss of information fixed. This loss of information is measured by the relative entropy between the experienced state-action distribution and the new one generated by the improved policy. We continue this line of reasoning and propose the Relative Entropy Policy Search (REPS) method. The resulting method differs significantly from previous policy gradient approaches and yields an exact update step.
We show applications of these methods to the improvement of robot behaviors and discuss their wider implications.
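As a rough illustration of the relative-entropy bound described in the abstract, the following sketch reweights a set of sampled rewards so that the expected reward increases while the relative entropy to the sampling distribution does not exceed a bound `epsilon`. It is a simplified, stateless (bandit-style) variant and not the method presented in the talk; the function names `reps_weights` and `kl_to_uniform`, the default `epsilon`, and the grid search over the temperature `eta` are all assumptions made for this example.

```python
import numpy as np


def reps_weights(rewards, epsilon=0.5):
    """Hypothetical helper: stateless REPS-style reweighting of samples.

    Minimizes the dual g(eta) = eta*epsilon + eta*log(mean(exp(R/eta)))
    over the temperature eta > 0, then returns normalized weights
    proportional to exp(R/eta).
    """
    rewards = np.asarray(rewards, dtype=float)
    # Subtracting the maximum only shifts the dual by a constant,
    # so the minimizing eta is unchanged and exp() cannot overflow.
    shifted = rewards - rewards.max()

    def dual(eta):
        return eta * epsilon + eta * np.log(np.mean(np.exp(shifted / eta)))

    # Assumption: a coarse log-spaced grid is accurate enough for a sketch.
    etas = np.logspace(-3, 3, 2000)
    eta = etas[np.argmin([dual(e) for e in etas])]

    weights = np.exp(shifted / eta)
    return weights / weights.sum(), eta


def kl_to_uniform(weights):
    """Relative entropy between the weights and a uniform distribution."""
    weights = np.asarray(weights, dtype=float)
    nonzero = weights > 0
    return float(np.sum(weights[nonzero] * np.log(weights[nonzero] * len(weights))))


if __name__ == "__main__":
    w, eta = reps_weights([0.0, 1.0, 2.0, 3.0], epsilon=0.5)
    print("weights:", w)
    print("eta:", eta)
    print("relative entropy:", kl_to_uniform(w))
```

In a setting like this, the resulting weights could be used for a weighted maximum-likelihood fit of the new policy, which is one way an update without a hand-tuned step size can arise.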
Date and Location
August 02 - 06, 2010
University of Leipzig
Augustusplatz
04103 Leipzig
Germany
Scientific Organizers
Nihat Ay
Max Planck Institute for Mathematics in the Sciences
Information Theory of Cognitive Systems Group
Germany
Paolo Gibilisco
Università degli Studi di Roma "Tor Vergata"
Facoltà di Economia
Italy
František Matúš
Academy of Sciences of the Czech Republic
Institute of Information Theory and Automation
Czech Republic
Scientific Committee
Shun-ichi Amari
RIKEN
Brain Science Institute, Mathematical Neuroscience Laboratory
Japan
Imre Csiszár
Hungarian Academy of Sciences
Alfréd Rényi Institute of Mathematics
Hungary
Dénes Petz
Budapest University of Technology and Economics
Department for Mathematical Analysis
Hungary
Giovanni Pistone
Collegio Carlo Alberto, Moncalieri
Italy
Administrative Contact
Antje Vandenberg
Max Planck Institute for Mathematics in the Sciences
Phone: (++49)-(0)341-9959-552
Fax: (++49)-(0)341-9959-555