Upcoming Events

Previous Events

11.07.24 Ruihan Wu (UCSD)
Harnessing Online and Bandit Algorithms for Two Practical Challenges
13.06.24 Vahid Shahverdi (KTH)
Algebraic Complexity and Neurovariety of Linear Convolutional Networks
06.06.24 Grigoris Chrysos (UW Madison)
Are activation functions required for learning in all deep networks?
30.05.24 Mariya Toneva (MPI SWS)
Language modeling beyond language modeling
23.05.24 Yiqiao Zhong (UW Madison)
Why interpolating neural nets generalize well: recent insights from the neural tangent model
16.05.24 Rémi Gribonval (University of Lyon)
Conservation Laws for Gradient Flows
09.05.24 Axel Flinth (Umeå University)
Achieving equivariance in neural networks
02.05.24 Etienne Boursier (INRIA Paris-Saclay)
Early Alignment in two-layer networks training is a two-edged sword
25.04.24 Johannes Maly (LMU München)
On the relation between over- and reparametrization of gradient descent
18.04.24 Jörg Lücke (University of Oldenburg)
On the Convergence and Transformation of the ELBO Objective to Entropy Sums
11.04.24 Semih Cayci (RWTH Aachen)
A Non-Asymptotic Analysis of Gradient Descent for Recurrent Neural Networks
04.04.24 Joachim Bona-Pellissier (MaLGa Universita di Genova)
Geometry-induced Implicit Regularization in Deep ReLU Neural Networks
21.03.24 Hossein Taheri (UCSB)
Generalization and Stability in Interpolating Neural Networks
14.03.24 Courtney Paquette (McGill University)
Hitting the High-D(imensional) Notes: An ODE for SGD learning dynamics in high-dimensions
07.03.24 Noam Razin (Tel Aviv University)
Analyses of Policy Gradient for Language Model Finetuning and Optimal Control
22.02.24 Tim Rudner (New York University)
Data-Driven Priors for Trustworthy Machine Learning
15.02.24 Maksym Andriushchenko (EPFL)
A Modern Look at the Relationship between Sharpness and Generalization
08.02.24 Julius Berner (Caltech)
Learning ReLU networks to high uniform accuracy is intractable
01.02.24 Yulia Alexandr (University of California, Berkeley)
Maximum information divergence from linear and toric models
25.01.24 Aaron Mishkin (Stanford University)
Optimal Sets and Solution Paths of ReLU Networks
18.01.24 Agustinus Kristiadi (Vector Institute)
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
07.12.23 Molei Tao (Georgia Institute of Technology)
Implicit biases of large learning rates in machine learning
30.11.23 Pan Li (Georgia Tech)
The Stability of Positional Encoding for Graph Neural Networks and Graph Transformers
16.11.23 Sjoerd Dirksen (Utrecht University)
The separation capacity of random neural networks
09.11.23 Courtney Paquette (McGill University)
Hitting the High-D(imensional) Notes: An ODE for SGD learning dynamics in high-dimensions
Attention: this event is cancelled
02.11.23 Oscar Leong (Caltech)
Theory and Algorithms for Deep Generative Networks in Inverse Problems
19.10.23 Daniel Murfet (University of Melbourne)
Phase transitions in learning machines
12.10.23 Micah Goldblum (NYU)
Bridging the gap between deep learning theory and practice
05.10.23 Michael Murray (UCLA)
Training shallow ReLU networks on noisy data using hinge loss: when do we overfit and is it benign?
21.09.23 Florian Schaefer (Georgia Tech)
Solvers, Models, Learners: Statistical Inspiration for Scientific Computing
14.09.23 Vidya Muthukumar (Georgia Institute of Technology)
Classification versus regression in overparameterized regimes: Does the loss function matter?
07.09.23 Rahul Parhi (EPFL)
Deep Learning Meets Sparse Regularization
31.08.23 Tom Tirer (Bar-Ilan University)
Exploring Deep Neural Collapse via Extended and Penalized Unconstrained Features Models
17.08.23 Surbhi Goel (University of Pennsylvania)
Beyond Worst-case Sequential Prediction: Adversarial Robustness via Abstention
03.08.23 Haoyuan Sun (MIT)
A Unified Approach to Controlling Implicit Regularization via Mirror Descent
27.07.23 Sanghamitra Dutta (University of Maryland)
Foundations of Reliable and Lawful Machine Learning: Approaches Using Information Theory
20.07.23 Sitan Chen (Harvard University)
Theory for Diffusion Models
13.07.23 Behrooz Tahmasebi (MIT)
Sample Complexity Gain from Invariances: Kernel Regression, Wasserstein Distance, and Density Estimation
06.07.23 Ronen Basri (Weizmann Institute of Science)
On the Connection between Deep Neural Networks and Kernel Methods
29.06.23 Bingbin Liu (Carnegie Mellon University)
Thinking Fast with Transformers: Algorithmic Reasoning via Shortcuts
15.06.23 Benjamin Bowman (UCLA and AWS)
On the Spectral Bias of Neural Networks in the Neural Tangent Kernel Regime
08.06.23 Simone Bombari (IST Austria)
Optimization, Robustness and Privacy. A Story through the Lens of Concentration
01.06.23 Rishi Sonthalia (UCLA)
Least Squares Denoising: Non-IID Data, Transfer Learning and Under-parameterized Double Descent
25.05.23 Łukasz Dębowski (IPI PAN)
A Simplistic Model of Neural Scaling Laws: Multiperiodic Santa Fe Processes
18.05.23 Duc N.M. Hoang (University of Texas, Austin)
Revisiting Pruning as Initialization through the Lens of Ramanujan Graph
11.05.23 Henning Petzka (RWTH Aachen University)
Relative flatness and generalization
27.04.23 Stéphane d'Ascoli (ENS and FAIR Paris)
Double descent: insights from the random feature model
20.04.23 Aditya Grover (UCLA)
Generative Decision Making Under Uncertainty
13.04.23 Wei Zhu (University of Massachusetts Amherst)
Implicit Bias of Linear Equivariant Steerable Networks
06.04.23 Jaehoon Lee (Google Brain)
Exploring Infinite-Width Limit of Deep Neural Networks
23.03.23 Dohyun Kwon (University of Wisconsin-Madison)
Convergence of score-based generative modeling
16.03.23 Chulhee Yun (KAIST)
Shuffling-based stochastic optimization methods: bridging the theory-practice gap
09.03.23 Ricardo S. Baptista (Caltech)
On the representation and learning of triangular transport maps
02.03.23 Samy Wu Fung (Colorado School of Mines)
Using Hamilton Jacobi PDEs in Optimization
23.02.23 Jalaj Bhandari (Columbia University)
Global guarantees for policy gradient methods
16.02.23 Julia Elisenda Grigsby (Boston College)
Functional dimension of ReLU Networks
09.02.23 Renjie Liao (University of British Columbia)
Gaussian-Bernoulli RBMs Without Tears
02.02.23 Mihaela Rosca (DeepMind)
On continuous time models of gradient descent and instability in deep learning
26.01.23 Liu Ziyin (University of Tokyo)
The Probabilistic Stability and Low-Rank Bias of SGD
19.01.23 Xiaowu Dai (UCLA)
Kernel Ordinary Differential Equations
12.01.23 Felipe Suárez-Colmenares (MIT)
Learning threshold neurons via the “edge of stability”
16.12.22 Jona Lelmi (University of Bonn)
Large data limit of the MBO scheme for data clustering
15.12.22 Nicolás García Trillos (University of Wisconsin-Madison)
Analysis of adversarial robustness and of other problems in modern machine learning
08.12.22 Mojtaba Sahraee-Ardakan (UCLA)
Equivalence of Kernel Methods and Linear Models in High Dimensions
01.12.22 Levon Nurbekyan (UCLA)
Efficient natural gradient method for large-scale optimization problems
17.11.22 Sebastian Goldt (SISSA Trieste)
On the importance of higher-order input statistics for learning in neural networks
10.11.22 El Mehdi Achour (IMT Toulouse)
The loss landscape of deep linear neural networks: A second-order analysis
03.11.22 Itay Safran (Purdue University)
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
27.10.22 Lechao Xiao (Google Brain)
Eigenspace restructuring: A Principle of space and frequency in neural networks
20.10.22 Joe Kileel (University of Texas at Austin)
The Effect of Parametrization on Nonconvex Optimization Landscapes
13.10.22 Chao Ma (Stanford University)
Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives
06.10.22 Felix Dietrich (Technical University of Munich)
Learning dynamical systems from data
22.09.22 Gintare Karolina Dziugaite (Google Brain)
Deep Learning through the Lens of Data
15.09.22 Lisa Maria Kreusser (University of Bath)
Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
08.09.22 Devansh Arpit (Salesforce)
Optimization Aspects that Improve IID and OOD Generalization in Deep Learning
25.08.22 Melanie Weber (Harvard University)
Exploiting geometric structure in (matrix-valued) optimization
18.08.22 Aditya Golatkar (UCLA)
Selective Forgetting and Mixed Privacy in Deep Networks
12.08.22 Francesco Di Giovanni (Twitter)
Over-squashing and over-smoothing through the lenses of curvature and multi-particle dynamics
04.08.22 Wei Hu (University of California, Berkeley)
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
28.07.22 Arash A. Amini (UCLA)
Target alignment in truncated kernel ridge regression
21.07.22 Mark Schmidt (University of British Columbia)
Optimization Algorithms for Training Over-Parameterized Models
14.07.22 Rong Ge (Duke University)
Towards Understanding Training Dynamics for Mildly Overparametrized Models
07.07.22 Sebastian Kassing (University of Bielefeld)
Convergence of Stochastic Gradient Descent for analytic target functions
23.06.22 Lisa Maria Kreusser (University of Bath)
Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
Attention: this event is cancelled
16.06.22 Jingfeng Wu (Johns Hopkins University)
A Fine-Grained Characterization for the Implicit Bias of SGD in Least Square Problems
09.06.22 Guang Cheng (UCLA)
Nonparametric Perspective on Deep Learning
02.06.22 Tan Nguyen (UCLA)
Principled Models for Machine Learning
26.05.22 Melikasadat Emami (UCLA)
Asymptotics of Learning in Generalized Linear Models and Recurrent Neural Networks
19.05.22 Sayan Mukherjee (MPI MiS, Leipzig and Universität Leipzig)
Inference in dynamical systems and the geometry of learning group actions
12.05.22 Zixiang Chen (UCLA)
Benign Overfitting in Two-layer Convolutional Neural Networks
05.05.22 Noam Razin (Tel Aviv University)
Generalization in Deep Learning Through the Lens of Implicit Rank Minimization
21.04.22 Berfin Şimşek (École Polytechnique Fédérale de Lausanne, EPFL)
The Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetry-Induced Saddles and Global Minima Manifold
14.04.22 Christoph Lampert (Institute of Science and Technology, Austria)
Robust and Fair Multisource Learning
07.04.22 Chaoyue Liu (Ohio State University)
Transition to Linearity of Wide Neural Networks
31.03.22 Benjamin Eysenbach (Carnegie Mellon University & Google Brain)
The Information Geometry of Unsupervised Reinforcement Learning
24.03.22 Ohad Shamir (Weizmann Institute of Science)
Implicit bias in machine learning
17.03.22 Gabin Maxime Nguegnang (RWTH Aachen University)
Convergence of gradient descent for learning deep linear neural networks
11.03.22 Uri Alon (CMU)
Bottlenecks, over-squashing, and attention in Graph Neural Networks
10.03.22 Gergely Neu (Universitat Pompeu Fabra, Barcelona)
Generalization Bounds via Convex Analysis
04.03.22 Daniel McKenzie (UCLA)
Implicit Neural Networks: What they are and how to use them
03.03.22 Marco Mondelli (Institute of Science and Technology Austria)
Gradient Descent for Deep Neural Networks: New Perspectives from Mean-field and NTK
24.02.22 Danijar Hafner (Google Brain & University of Toronto)
General Infomax Agents through World Models
17.02.22 Maksim Velikanov (Skolkovo Institute of Science and Technology)
Spectral properties of wide neural networks and their implications to the convergence speed of different Gradient Descent algorithms
10.02.22 Hui Jin (University of California, Los Angeles)
Learning curves for Gaussian process regression with power-law priors and targets
04.02.22 Song Mei (University of California, Berkeley)
The efficiency of kernel methods on structured datasets
27.01.22 Daniel Russo (Columbia University)
Adaptivity and Confounding in Multi-armed Bandit Experiments
20.01.22 Philipp Petersen (University of Vienna)
Optimal learning in high-dimensional classification problems
13.01.22 Yuan Cao (University of California, Los Angeles)
Towards Understanding and Advancing the Generalization of Adam in Deep Learning
16.12.21 Zhi-Qin John Xu (Shanghai Jiao Tong University)
Occam’s razors in neural networks: frequency principle in training and embedding principle of loss landscape
02.12.21 Ekaterina Lobacheva (HSE University), Maxim Kodryan (HSE University)
On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
25.11.21 Michael Joswig (TU Berlin)
Geometric Disentanglement by Random Convex Polytopes
18.11.21 Daniel Soudry (Technion)
Algorithmic Bias Control in Deep learning
11.11.21 Christoph Hertrich (TU Berlin)
Towards Lower Bounds on the Depth of ReLU Neural Networks
04.11.21 Hyeyoung Park (Kyungpook National University)
Effect of Geometrical Singularity on Learning Dynamics of Neural Networks
21.10.21 David Stutz (MPI for Informatics, Saarbrücken)
Relating Adversarial Robustness and Weight Robustness Through Flatness
14.10.21 Mathias Drton (Technical University of Munich)
A Bayesian information criterion for singular models
07.10.21 Roger Grosse (University of Toronto)
Self-tuning networks: Amortizing the hypergradient computation for hyperparameter optimization
23.09.21 Huan Xiong (MBZUAI, Abu Dhabi)
On the Number of Linear Regions of Convolutional Neural Networks with Piecewise Linear Activations
16.09.21 Pavel Izmailov (New York University)
What Are Bayesian Neural Network Posteriors Really Like?
09.09.21 Shaowei Lin (working in a stealth startup)
All you need is relative information
02.09.21 Matthew Trager (Amazon US)
A Geometric View of Functional Spaces of Neural Networks
26.08.21 Guodong Zhang (University of Toronto)
Differentiable Game Dynamics: Hardness and Complexity of Equilibrium Learning
22.07.21 Soledad Villar (Johns Hopkins University)
Scalars are universal: Gauge-equivariant machine learning, structured like classical physics
15.07.21 Hossein Mobahi (Google Research)
Self-Distillation Amplifies Regularization in Hilbert Space
08.07.21 Roger Grosse (University of Toronto)
Self-tuning networks: Amortizing the hypergradient computation for hyperparameter optimization
01.07.21 Dominik Janzing (Amazon Research, Tübingen, Germany)
Why we should prefer simple causal models
17.06.21 Ziv Goldfeld (Cornell University)
Scaling Wasserstein distances to high dimensions via smoothing
10.06.21 Isabel Valera (Saarland University and MPI for Intelligent Systems)
Algorithmic recourse: theory and practice
03.06.21 Stanisław Jastrzębski (Jagiellonian University)
Reverse-engineering implicit regularization due to large learning rates in deep learning
27.05.21 Sebastian Reich (University of Potsdam and University of Reading)
Statistical inverse problems and affine-invariant gradient flow structures in the space of probability measures