Workshop

Intrinsically motivated exploration of behavioural modes

  • Keyan Ghazi-Zahedi (Max Planck Institute for Mathematics in the Sciences, Germany)
E1 05 (Leibniz-Saal)

Abstract

In machine learning, acting to learn is most often associated with exploring the world model: actions are chosen to minimise the error of the internal representation of the external world dynamics. This essential form of acting to learn will be discussed in the first lectures of this series. This lecture focuses on a different aspect that is important for embodied intelligence, namely the autonomous learning of behavioural modes. The different gaits of a horse are a good illustration of what we describe as a behavioural mode: a behaviour that is compliant with the body and the environment of the system or, in other words, one that exploits the physical properties of the system’s body and environment such that it is optimal with respect to some criterion (e.g. energy consumption). The main question of this lecture is how a system that is initialised with minimal knowledge can autonomously gather useful information about itself and its habitat. Human infants are an example: they have to learn their behavioural modes, e.g. grasping and walking, without direct support from an external teacher. Although parents may prevent their children from falling, they cannot actually teach them how to walk. This form of learning must be a self-organised process. In this last lecture of the acting to learn series, we will discuss the maximisation of the predictive information as an information-driven, self-organising learning principle that leads to the autonomous learning of behavioural modes.

The lecture consists of four parts. The first part motivates the question in more detail and introduces the mathematical framework as far as is required to follow this lecture without prior knowledge of information theory or of the previous lectures in this series. In the second part, we will discuss the causal model of the sensorimotor loop and its utility in the context of embodied artificial intelligence. As an example, the quantification of causal information flows between actions and sensations will be derived from this model. The third part will introduce the predictive information and its application to the sensorimotor loop. In the last part, we will use Amari's natural gradient method to derive a policy gradient for the maximisation of the predictive information and apply it to embodied systems. The discussion of the results will also lead to insights about morphological computation.
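As a rough illustration of the quantity at the centre of the third part, the one-step predictive information can be read as the mutual information between consecutive sensor states, I(S_t; S_{t+1}). The sketch below is not taken from the lecture: it assumes discrete sensor states and a simple plug-in (frequency-count) estimate from a single recorded sequence, and the function name `predictive_information` is hypothetical.

```python
import math
from collections import Counter


def predictive_information(states):
    """Plug-in estimate of I(S_t; S_{t+1}) in bits.

    Assumption: `states` is one sequence of discrete (hashable) sensor
    states; probabilities are estimated by counting consecutive pairs.
    """
    pairs = list(zip(states[:-1], states[1:]))
    n = len(pairs)
    if n == 0:
        return 0.0

    joint = Counter(pairs)                     # counts of (s_t, s_{t+1})
    past = Counter(s for s, _ in pairs)        # counts of s_t
    future = Counter(s for _, s in pairs)      # counts of s_{t+1}

    mi = 0.0
    for (s, s_next), count in joint.items():
        p_joint = count / n
        # p(s, s') / (p(s) p(s')) written in terms of raw counts
        ratio = count * n / (past[s] * future[s_next])
        mi += p_joint * math.log2(ratio)
    return mi


# A strictly alternating sensor stream is fully predictable one step ahead,
# while a constant stream carries no information to predict.
alternating = [0, 1] * 50
constant = [0] * 100
print(predictive_information(alternating))
print(predictive_information(constant))
```

In this toy setting the alternating stream gives a value close to one bit and the constant stream gives zero, which reflects why maximising this quantity favours behaviour that is both varied and predictable. A learning rule such as the policy gradient mentioned above would adjust the controller to increase this value; that step is not sketched here.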

Marion Lange

Stuttgart University / TU Berlin, Germany

Nihat Ay

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Marc Toussaint

Stuttgart University, Germany