A Non-Asymptotic Analysis of Gradient Descent for Recurrent Neural Networks
- Semih Cayci (RWTH Aachen)
Abstract
This talk will present a non-asymptotic analysis of recurrent neural networks trained with gradient descent for regression in the kernel regime, without massive overparameterization. Our in-depth analysis (i) provides sharp bounds on the network size and iteration complexity in terms of the sequence length, sample size, and ambient dimension, and (ii) identifies the significant impact of long-term dependencies in the dynamical system on the performance bounds, characterized by a cutoff point that depends on the modulus of continuity of the activation function. Remarkably, this analysis reveals that an appropriately initialized recurrent neural network trained with