Looking at data efficiency in RL
- Razvan Pascanu (DeepMind London)
Deep Reinforcement Learning (DRL), while producing some impressive results (e.g. on Atari and Go), is notoriously data-inefficient. This is partly due to the function approximators used (deep networks), but also to the weak learning signal (based on observed rewards).
This talk will focus on the role transfer learning could play in improving the data efficiency of DRL. In particular, the core of the talk will center on the different uses of the KL-regularized RL formulation explored in recent works (e.g. arxiv.org/abs/1707.04175, arxiv.org/abs/1806.01780, openreview.net/forum.
Time permitting, I will extend the discussion to some work-in-progress observations about the learning dynamics of neural networks (particularly in RL) and how to exploit the piecewise structure of neural networks (particularly the folding of the space) to efficiently learn generative models.