Bandits, global optimization, active learning, and Bayesian reinforcement learning -- understanding the common ground

Marc Toussaint (Stuttgart University, Germany)

E1 05 (Leibniz-Saal)

Abstract

In active learning and related scenarios, the learning agent actively selects which data point should be labeled next. That is, instead of learning passively from a previously selected data set, the learning agent controls the data collection itself, with the aim of collecting the data that is most informative to the model. This can greatly reduce the number of data points needed. In this tutorial I will give a basic introduction to this area, including:

What is the relation between bandits, optimization, active learning and Bayesian reinforcement learning?
What would "optimal active learning" be?
What are typical and efficient heuristics (including regret bounds)?
What are modern Monte-Carlo tree search approaches?
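
As a concrete illustration of the kind of heuristic with a regret bound mentioned above, here is a minimal sketch of UCB1, a standard bandit rule whose expected regret grows only logarithmically in the number of rounds. It is not taken from the tutorial itself; the function name `ucb1`, the callback `pull`, and the Bernoulli arm probabilities are assumptions made for illustration only.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` must return a reward in [0, 1]."""
    counts = [0] * n_arms    # how often each arm has been pulled
    means = [0.0] * n_arms   # running mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1      # pull every arm once to initialise the estimates
        else:
            # Optimism under uncertainty: empirical mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean update
    return counts, means


# Hypothetical test bed: three Bernoulli arms with made-up success probabilities.
rng = random.Random(0)
probs = [0.2, 0.5, 0.8]
counts, means = ucb1(lambda a: 1.0 if rng.random() < probs[a] else 0.0, len(probs), 2000)
```

The exploration bonus shrinks as an arm is pulled more often, so the agent gradually concentrates its data collection on the arm that looks best while still occasionally checking the others. This is the same trade-off between informativeness and reward that links bandits to active learning.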

01.09.14 to 04.09.14

Autonomous Learning: Summer School 2014

MPI für Mathematik in den Naturwissenschaften Leipzig E1 05 (Leibniz-Saal)

Marion Lange

Stuttgart University / TU Berlin, Germany

Nihat Ay

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Marc Toussaint

Stuttgart University, Germany