Deep Learning Theory Group Seminar

seminar

09.04.20 21.01.22

Deep Learning Theory Group Seminar

- Past Events


21.01.22	Tiffany Vlaar (University of Edinburgh) : to be announced
20.01.22	Ruhui Jin (University of Texas at Austin) : Tensor-structured sketching for constrained optimizations Constrained least squares are fundamental in applied math, statistics and many other disciplines. The memory and computation costs are expensive in practice due to high-dimensional input data. We apply the so-called "sketching" strategy to project the problem onto a space of a much lower dimension while maintaining the approximation accuracy. Tensor structure often appears in the data matrices of optimization problems. In this talk, we will discuss: 1. Sketching designs for optimizations that have special tensor format. 2. Theoretical guarantees of tensor sketching for optimizations with general constrained sets, for instance, linear regression and sparse recovery. The sketching dimension depends on a statistical complexity that characterizes the geometry of the underlying problem.

21.01.22

Tiffany Vlaar (University of Edinburgh) : to be announced

20.01.22

Ruhui Jin (University of Texas at Austin) : Tensor-structured sketching for constrained optimizations

Constrained least squares are fundamental in applied math, statistics and many other disciplines. The memory and computation costs are expensive in practice due to high-dimensional input data. We apply the so-called "sketching" strategy to project the problem onto a space of a much lower dimension while maintaining the approximation accuracy. Tensor structure often appears in the data matrices of optimization problems. In this talk, we will discuss: 1. Sketching designs for optimizations that have special tensor format. 2. Theoretical guarantees of tensor sketching for optimizations with general constrained sets, for instance, linear regression and sparse recovery. The sketching dimension depends on a statistical complexity that characterizes the geometry of the underlying problem.


12.10.21	Johannes Müller (MPI MiS, Leipzig) : Degree of the value function (Progress report)
28.09.21	Hanna Tseran (MPI MiS, Leipzig) : How to Start Training: The Effect of Initialization and Architecture (Paper review)
21.09.21	Pierre Bréchet (MPI MiS, Leipzig) : Role(s) of the regularization in the convergence of GANs
14.09.21	Benjamin Bowman (University of California, LA) : Can we use the NTK outside the NTK regime?
07.09.21	Johannes Müller (MPI MiS, Leipzig) : NNs for PDEs: Differences to supervised learning (Bytes)
24.08.21	Hanna Tseran (MPI MiS, Leipzig) : Decision Boundary of Neural Networks (Topic overview)
27.07.21	Jiayi Li (University of California, Los Angeles) : Rademacher Complexity for Adversarial Robust Generalization (Paper review)
13.07.21	Pierre Bréchet (MPI MiS, Leipzig) : Convergence in GANs: The discrete case
29.06.21	Johannes Müller (MPI MiS, Leipzig) : Metrics and continuity in reinforcement learning (Paper review)
22.06.21	Hanna Tseran (MPI MiS, Leipzig) : Neural architecture search (Bytes)
08.06.21	Jiayi Li (University of California, Los Angeles) : Graph Coarsening with neural networks (Paper review)
01.06.21	Pierre Bréchet (MPI MiS, Leipzig) : Dynamical analysis of GANs (Bytes)
20.04.21	Yuguang Wang (MPI MiS, Leipzig) : Generalization of GNNs
13.04.21	Hanna Tseran (MPI MiS, Leipzig) : Neural Architecture Search (Topical overview)
06.04.21	Pierre Bréchet (MPI MiS, Leipzig) : Curvature in the representation space of GANs (Bytes)
30.03.21	Pradeep Banerjee (MPI MiS, Leipzig) : Information complexity and flat minima
16.03.21	Yuguang Wang (MPI MiS, Leipzig) : Information bottleneck for GNNs
09.03.21	Hanna Tseran (MPI MiS, Leipzig) : Topology for Characterizing Capacity of Neural Networks (Bytes)
02.03.21	Pierre Bréchet (MPI MiS, Leipzig) : Discrete Wasserstein Gradient Flow (Bytes)
16.02.21	Johannes Müller (MPI MiS, Leipzig) : An explicitely solvable toy model from RL
09.02.21	Hanna Tseran (MPI MiS, Leipzig) : Effects of Neural Network Initialization (Paper review)
02.02.21	Johannes Müller (MPI MiS, Leipzig) : Monte Carlo rates and neural network approximation (Bytes)


15.12.20	Yuguang Wang (MPI MiS, Leipzig) : Framelet graph convolution
01.12.20	Johannes Müller (MPI MiS, Leipzig) : Notions from RL: Overview of existing methods
17.11.20	Pierre Bréchet (MPI MiS, Leipzig) : Wasserstein Gradient Flow (III): Mean-field approximation regime of Neural Networks
05.11.20	Pierre Bréchet (MPI MiS, Leipzig) : Wasserstein Gradient Flow (II): Mean-field approximation regime of Neural Networks
17.09.20	Johannes Müller (MPI MiS, Leipzig) : Complexity reduction for optimal policies: the linear case
04.09.20	Ido Nachum (École Polytechnique Fédérale de Lausanne) : On the Information Complexity of Learning Machine learning and information theory tasks are in some sense equivalent since both involve identifying patterns and regularities in data. To recognize an elephant, a child (or a neural network) observes the repeating pattern of big ears, a trunk, and grey skin. To compress a book, a compression algorithm searches for highly repeating letters or words. So the high-level question that guided this research is: When is learning equivalent to compression? We use the quantity I(input; output), the mutual information between the training data and the output hypothesis of the learning algorithm, to measure the compression of the algorithm. Under this information-theoretic setting, these two notions are indeed equivalent. a) Compression implies learning. We will show that learning algorithms that retain a small amount of information from their input sample generalize. b) Learning implies compression. We will show that under an average-case analysis, every hypothesis class of finite VC dimension (a characterization of learnable classes) has empirical risk minimizers (ERM) that do not reveal a lot of information. If time permits, we will discuss a worst-case lower bound we proved by presenting a simple concept class for which every empirical risk minimizer (also randomized) must reveal a lot of information and also observe connections with well-studied notions such as sample compression schemes, Occam's razor, PAC-Bayes, and differential privacy.
03.09.20	Pierre Bréchet (MPI MiS, Leipzig) : Wasserstein gradient flow
27.08.20	Yuguang Wang (MPI MiS, Leipzig) : Decimated Framelet System and Fast $\gph$ -Framelet Transforms on Graph Graph representation learning has many real-world applications, from image processing, facial recognition, drug repurposing, social networks analysis to protein classification. For representation learning on graph, effective representation of graph data is crucial to learning performance. In this paper, we propose a new multiscale representation system for graph data --- decimated framelets. The framelets form a localized, tight framelet system on the graph via a data-driven filter bank. We establish discrete framelet transforms to decompose and reconstruct data on the graph under the decimated framelets. The graph framelets are built on a chain-based orthonormal basis which has fast graph Fourier transforms. From this, we then give a fast algorithm for the discrete framelet transforms on the graph --- {\fgt}, which has computational cost $\bigo N \log N$ for the graph with size $N$ . Numerical examples verify the theory of decimated framelets and {\fgt}. Application to real-world examples, such as multiresolution analysis for traffic network, node classification by graph neural networks, illustrate that decimated framelets provide a useful graph data representation and {\fgt} is critical to improving computing performance. This is joint work with Xuebin Zheng, Bingxin Zhou and Xiaosheng Zhuang.
20.08.20	Quynh Nguyen (MPI MiS, Leipzig) : Some thoughts on proving convergence of gradient descent for neural networks
13.08.20	Johannes Müller (MPI MiS, Leipzig) : The tangent kernel of parametric models: A unified approach (tentative)
06.08.20	Yuguang Wang (MPI MiS, Leipzig) : Distributed Filtered Hyperinterpolation on Manifolds
31.07.20	Quynh Nguyen (MPI MiS, Leipzig) : On connected sublevel sets in deep learning
30.07.20	Hanna Tseran (MPI MiS, Leipzig) : Tropical Geometry for Counting Linear Regions (Progress Report)
25.06.20	Pierre Bréchet (MPI MiS, Leipzig) : Probability functional descent: A unifying perspective on GANs, Variational inference, and Reinforcement learning (Paper review)
18.06.20	Hanna Tseran (MPI MiS, Leipzig) : Complexity of Linear Regions in Deep Networks - II (Paper Review)
12.06.20	Yonatan Dukler (University of California, Los Angeles) : Optimization Theory for ReLU Neural NetworksTrained with Normalization Layers The success of deep neural networks is in part due to the use of normalization layers. Normalization layers like Batch Normalization, Layer Normalization and Weight Normalization are ubiquitous in practice, as they improve generalization performance and speed up training significantly. Nonetheless, the vast majority of current deep learning theory and non-convex optimization literature focuses on the un-normalized setting, where the functions under consideration do not exhibit the properties of commonly normalized neural networks. In this paper, we bridge this gap by giving the first global convergence result for two-layer neural networks with ReLU activations trained with a normalization layer, namely Weight Normalization. Our analysis shows how the introduction of normalization layers changes the optimization landscape and can enable faster convergence as compared with un-normalized neural networks. This is joint work with Quanquan Gu and Guido Montufar.
11.06.20	Johannes Müller (MPI MiS, Leipzig) : A closer look at the approximation capabilities of neural networks (Paper review)
04.06.20	Pradeep Banerjee (MPI MiS, Leipzig) : PAC Bayes and wide minima
28.05.20	Pierre Brechet (MPI MiS, Leipzig) : Smoothness and stability of GANs (review)
21.05.20	Hanna Tseran (MPI MiS, Leipzig) : Algebraic geometry and statistical learning theory (Overview)
14.05.20	Nina Otter (University of California, Los Angeles) : Topology of deep neural networks I will discuss the preprint “Topology of deep neural networks” by Naitzat, Zhitnikov and Lim, available at https://arxiv.org/abs/2004.06093
07.05.20	Johannes Müller (MPI MiS, Leipzig) : Space-time expressivity of ResNets Residual networks (ResNets) are a deep learning architecture that substantially improved the state of the art performance in certain supervised learning tasks. Since then, they have received continuously growing attention. ResNets have a recursive structure x_{k+1} = x_k + R_k(x_k) where R_k is a neural network called a residual block. This structure can be seen as the Euler discretisation of an associated ordinary differential equation (ODE) which is called a neural ODE. Recently, ResNets were proposed as the space-time approximation of ODEs which are not of this neural type. To elaborate this connection we show that by increasing the number of residual blocks as well as their expressivity the solution of an arbitrary ODE can be approximated in space and time simultaneously by deep ReLU ResNets. Further, we derive estimates on the complexity of the residual blocks required to obtain a prescribed accuracy under certain regularity assumptions.
30.04.20	Pierre Brechet (MPI MiS, Leipzig) : Optimal transport with global invariance (Paper review)
23.04.20	Hanna Tseran (MPI MiS, Leipzig) : Neural Network Expressivity (Overview)
16.04.20	Pradeep Banerjee (MPI MiS, Leipzig) : Information theoretic generalisation bounds
09.04.20	Pradeep Banerjee (MPI MiS, Leipzig) : Non-vacuous PAC-Bayes generalization bounds

Deep Learning Theory Group Seminar

Deep Learning Theory Group Seminar

Previous Events