MiS Preprint Repository

We have decided to discontinue the publication of preprints on our preprint server as of 1 March 2024. The publication culture within mathematics has changed so much due to the rise of repositories such as arXiv (www.arxiv.org) that we are encouraging all institute members to make their preprints available there. An institute repository in its previous form is therefore unnecessary. The preprints published to date will remain available here, but no new preprints will be added.

MiS Preprint 60/2013

Information-driven intrinsic motivation in reinforcement learning

Keyan Zahedi, Georg Martius and Nihat Ay

Abstract

One of the main challenges in the field of embodied artificial intelligence is the open-ended autonomous learning of complex behaviours. Our approach is to use task-independent, information-driven intrinsic motivation(s) to support task-dependent learning. The work presented here is a preliminary step in which we investigate the predictive information (the mutual information of the past and future of the sensor stream) as an intrinsic drive, ideally supporting any kind of task acquisition. Previous experiments have shown that the predictive information (PI) is a good candidate to support autonomous, open-ended learning of complex behaviours, because a maximisation of the PI corresponds to an exploration of morphology- and environment-dependent behavioural regularities. The idea is that these regularities can then be exploited in order to solve any given task. Three different experiments are presented, and their results lead to the conclusion that the linear combination of the one-step PI with an external reward function is not generally recommended in an episodic policy gradient setting. Only for hard tasks can a great speed-up be achieved, and then at the cost of a loss in asymptotic performance.
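
To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the sensor stream has already been discretised into a sequence of hashable states, estimates the one-step PI as the plug-in mutual information between consecutive sensor states, and adds it to an external episode reward. The function names and the weight `beta` are hypothetical and chosen only for illustration; the estimator and combination used in the preprint may differ.

```python
import math
from collections import Counter


def one_step_predictive_information(states):
    """Plug-in estimate (in bits) of I(S_t; S_{t+1}) for a discrete sequence.

    Assumption: `states` is an already discretised sensor stream.
    """
    pairs = list(zip(states[:-1], states[1:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                    # counts of (s_t, s_{t+1})
    past = Counter(s for s, _ in pairs)       # marginal counts of s_t
    future = Counter(s for _, s in pairs)     # marginal counts of s_{t+1}
    pi = 0.0
    for (s, s_next), count in joint.items():
        p_joint = count / n
        # p(s, s') / (p(s) p(s')) expressed with raw counts
        ratio = count * n / (past[s] * future[s_next])
        pi += p_joint * math.log2(ratio)
    return pi


def combined_reward(external_reward, states, beta=0.1):
    """Linear combination of an external reward with the one-step PI.

    `beta` is a hypothetical weighting factor, not a value from the preprint.
    """
    return external_reward + beta * one_step_predictive_information(states)
```

Under these assumptions, a strictly alternating stream such as `[0, 1, 0, 1, ...]` is fully predictable one step ahead and carries one bit of one-step PI, while a constant stream carries none, because a sensor value that never varies holds no information to predict.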

Received: Jun 25, 2013
Published: Jun 25, 2013
MSC Codes: 94A15
Keywords: Information-driven self-organisation, Predictive Information, Reinforcement Learning, Embodied Artificial Intelligence, Embodied Machine Learning

Related publications

2013 Journal Open Access
Keyan Ghazi-Zahedi, Georg Martius and Nihat Ay

Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting : a critical analysis

In: Frontiers in psychology, 4 (2013), p. 801