
Talk

7 October 2021, 17:00

Self-tuning networks: Amortizing the hypergradient computation for hyperparameter optimization

Roger Grosse (University of Toronto)

Live Stream

Abstract

Optimization of many deep learning hyperparameters can be formulated as a bilevel optimization problem. While most black-box and gradient-based approaches require many independent training runs, we aim to adapt hyperparameters online as the network trains. The main challenge is to approximate the response Jacobian, which captures how the minimum of the inner objective changes as the hyperparameters are perturbed. To do this, we introduce the self-tuning network (STN), which fits a hypernetwork to approximate the best response function in the vicinity of the current hyperparameters. Differentiating through the hypernetwork lets us efficiently approximate the gradient of the validation loss with respect to the hyperparameters. We train the hypernetwork and hyperparameters jointly. Empirically, we can find hyperparameter settings competitive with Bayesian Optimization in a single run of training, and in some cases find hyperparameter schedules that outperform any fixed hyperparameter value.
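
As a rough sketch of the setup described above (the notation is an assumption introduced here for illustration and is not taken from the abstract): let $\lambda$ denote the hyperparameters, $w$ the network weights, $\mathcal{L}_T$ the training (inner) objective, $\mathcal{L}_V$ the validation (outer) objective, and $\hat{w}_\phi$ a hypernetwork with parameters $\phi$. Under these assumptions, the bilevel problem, the hypergradient, and the hypernetwork approximation might be written as:

```latex
\begin{align*}
\lambda^\ast &= \arg\min_{\lambda} \mathcal{L}_V\bigl(\lambda, w^\ast(\lambda)\bigr),
\qquad
w^\ast(\lambda) = \arg\min_{w} \mathcal{L}_T(\lambda, w), \\
\frac{d \mathcal{L}_V}{d \lambda}
&= \frac{\partial \mathcal{L}_V}{\partial \lambda}
 + \Bigl(\frac{\partial w^\ast}{\partial \lambda}\Bigr)^{\top}
   \frac{\partial \mathcal{L}_V}{\partial w}\bigg|_{w = w^\ast(\lambda)}, \\
\hat{w}_\phi(\lambda') &\approx w^\ast(\lambda') \ \text{ for } \lambda' \text{ near } \lambda
\quad\Longrightarrow\quad
\frac{\partial w^\ast}{\partial \lambda} \approx \frac{\partial \hat{w}_\phi}{\partial \lambda}.
\end{align*}
```

Here $\partial w^\ast / \partial \lambda$ plays the role of the response Jacobian, and $w^\ast(\lambda)$ is the best response function. The last line holds only to the extent that the hypernetwork fits the best response in the neighborhood of the current hyperparameters.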

Links

seminar

2 April 2020 to 16 April 2026

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences, Live Stream
