Talk

Thursday, April 11, 2024 17:00

A Non-Asymptotic Analysis of Gradient Descent for Recurrent Neural Networks

Semih Cayci (RWTH Aachen)

Live Stream

Abstract

This talk will present a non-asymptotic analysis of recurrent neural networks trained with gradient descent for regression in the kernel regime, without massive overparameterization. Our analysis (i) provides sharp bounds on the network size and iteration complexity in terms of the sequence length, sample size, and ambient dimension, and (ii) identifies the significant impact of long-term dependencies in the dynamical system on the performance bounds, characterized by a cutoff point that depends on the modulus of continuity of the activation function. Remarkably, the analysis reveals that an appropriately initialized recurrent neural network trained with $n$ samples can achieve near-optimality with a network size $m$ that scales only logarithmically with $n$. This contrasts sharply with prior works, which require a high-order polynomial dependency of $m$ on $n$ to establish strong regularity conditions. Our results are based on an explicit characterization of the class of dynamical systems that can be approximated and learned by recurrent neural networks via norm-constrained transportation mappings, and on establishing local smoothness properties of the hidden state with respect to the learnable parameters.

seminar

5/2/24 5/16/24

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream

Katharina Matschke

MPI for Mathematics in the Sciences

Upcoming Events of This Seminar

May 2, 2024 Early Alignment in two-layer networks training is a two-edged sword with Etienne Boursier
May 16, 2024 to be announced with Rémi Gribonval