Benjamin Fehrman :
Convergence rates for the stochastic gradient descent method for non-convex objective functions
In this talk, based on joint work with Benjamin Gess and Arnulf Jentzen, we establish a rate of convergence to minima for the stochastic gradient descent method in the case of an objective function that is not necessarily globally, or even locally, convex, nor globally attracting. We do not assume that the critical points of the objective function are nondegenerate, which allows for the type of degeneracies observed in practice in the optimization of certain neural networks. Our analysis and estimates rely on the use of mini-batches in a quantitative way to control the loss of iterates to non-attracting regions.
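The role of mini-batches described above can be illustrated with a minimal sketch (not the construction from the talk): minimizing a non-convex objective with a degenerate minimum, where averaging gradient samples over a mini-batch shrinks the noise variance and keeps iterates near the minimum. All names and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_grad(x):
    # Noisy gradient of the objective f(x) = x**4, whose minimum at 0 is
    # degenerate (f''(0) = 0), mimicking the degeneracies mentioned above.
    return 4 * x**3 + rng.normal(scale=1.0)

def sgd(x0, steps=2000, batch_size=32, lr=0.05):
    x = x0
    for _ in range(steps):
        # Averaging a mini-batch of gradient samples reduces the noise
        # variance by a factor 1/batch_size, which controls how far the
        # iterates can wander away from the minimum.
        g = np.mean([stochastic_grad(x) for _ in range(batch_size)])
        x -= lr * g
    return x

x_final = sgd(x0=2.0)
```

With a batch size of 1 instead, the iterates fluctuate far more widely around the degenerate minimum, since the weak restoring force of the quartic objective cannot compensate for unaveraged gradient noise.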
Marc Josien :
Integrodifferential equations for simulating moving dislocations
In this talk, we introduce an integrodifferential equation motivated by dislocation dynamics in materials science. This equation involves a stiff convolution kernel that is nonlocal in both time and space (there is a memory effect). Its simulation raises two types of questions: the stability of the employed numerical method, and the algorithmic efficiency of the method, both in terms of computational complexity and memory requirements. The challenge is to store as little information as possible and to use it efficiently. We describe two families of fast methods improving the algorithmic efficiency: the first relying on the Fast Fourier Transform, and the second on the Laplace transform.
- E. Hairer, C. Lubich, and M. Schlichte.
Fast numerical solution of nonlinear Volterra convolution equations.
SIAM J. Sci. Statist. Comput., 6(3):532--541, 1985.
- C. Lubich and A. Schädle.
Fast convolution for nonreflecting boundary conditions.
SIAM J. Sci. Comput., 24(1):161--182, 2002.
- P. Pellegrini.
Dynamic Peierls-Nabarro equations for elastically isotropic crystals.
Phys. Rev. B, 81:024101, 2010.
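The FFT-based family of fast methods rests on a standard observation, sketched below (an illustrative example, not the scheme from the talk): a discrete convolution of length n costs O(n^2) when computed directly, but O(n log n) via the Fourier transform, which matters when the kernel carries a long memory.

```python
import numpy as np

def fft_convolve(f, g):
    # Linear convolution via FFT: zero-pad to length len(f)+len(g)-1 to
    # avoid circular wrap-around, multiply in Fourier space, and transform
    # back.  Cost O(n log n) versus O(n^2) for the direct double sum.
    n = len(f) + len(g) - 1
    return np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(g, n), n)

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.5, 1.0])
direct = np.convolve(f, g)   # direct sum: [0.5, 2.0, 3.5, 3.0]
fast = fft_convolve(f, g)    # agrees with the direct result
```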
Orlando Marigliano :
Polynomials in nonparametric inference
Can algebra help statisticians? I report on work in progress on nonparametric statistical inference using polynomials for approximation. The results we would like to obtain depend on theorems about multivariate real polynomials that are yet to be proven.
Thomas Merkh :
Universal approximation bounds for deep stochastic feedforward networks
This work focuses on the representational power of deep stochastic feedforward neural networks. A question of great interest in the graphical models and machine learning communities has been understanding how representational capacity differs between deep and shallow architectures. Here, we determine bounds on the width and depth that a deep network needs in order to approximate any stochastic mapping from the set of inputs to the set of outputs arbitrarily well; such a network can be regarded as a universal approximator. These bounds were obtained by analyzing the network in terms of compositions of linear transformations of the probability distributions expressible by its layers. This analysis reveals a spectrum of networks capable of universal approximation: as the layer width is decreased, the network depth must increase proportionally. Furthermore, we give an explicit construction of a network with the minimum number of trainable parameters necessary for universal approximation.
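The compositional viewpoint above can be sketched concretely (an illustrative toy, not the construction from the talk): each layer of a stochastic feedforward network acts on distributions as a row-stochastic matrix of conditional probabilities, and the network's input-output behavior is the product of these matrices, which is again row-stochastic.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_stochastic_layer(n_in, n_out):
    # Each row is a conditional distribution p(output unit | input unit),
    # so the layer maps probability vectors to probability vectors.
    m = rng.random((n_in, n_out))
    return m / m.sum(axis=1, keepdims=True)

# A stochastic feedforward network composes such Markov kernels layer by
# layer; its overall input-output kernel is the matrix product.
layers = [random_stochastic_layer(4, 6),
          random_stochastic_layer(6, 6),
          random_stochastic_layer(6, 3)]

net = np.linalg.multi_dot(layers)
row_sums = net.sum(axis=1)  # each row of the composition still sums to 1
```

Universality in this picture asks which kernels `net` can reach as the widths, depth, and entries of the layers vary, which is where the width-depth trade-off described above enters.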
Ivan Yamshchikov :
Information representations for natural language processing
Effective representation of textual information is crucial for a number of tasks in natural language processing. Representations that can decompose different aspects of a text into separate interpretable components are arguably particularly interesting. In this talk we briefly discuss current approaches to this problem, as well as the bottlenecks that currently hinder the development of universal and effective information representations applicable to a variety of NLP tasks.