
Upcoming Events of This Seminar

Previous Events

Apr 25, 2024 Johannes Maly (LMU München)
On the relation between over- and reparametrization of gradient descent
Apr 18, 2024 Jörg Lücke (University of Oldenburg)
On the Convergence and Transformation of the ELBO Objective to Entropy Sums
Apr 11, 2024 Semih Cayci (RWTH Aachen)
A Non-Asymptotic Analysis of Gradient Descent for Recurrent Neural Networks
Apr 4, 2024 Joachim Bona-Pellissier (MaLGa, Università di Genova)
Geometry-induced Implicit Regularization in Deep ReLU Neural Networks
Mar 21, 2024 Hossein Taheri (UCSB)
Generalization and Stability in Interpolating Neural Networks
Mar 14, 2024 Courtney Paquette (McGill University)
Hitting the High-D(imensional) Notes: An ODE for SGD learning dynamics in high-dimensions
Mar 7, 2024 Noam Razin (Tel Aviv University)
Analyses of Policy Gradient for Language Model Finetuning and Optimal Control
Feb 22, 2024 Tim Rudner (New York University)
Data-Driven Priors for Trustworthy Machine Learning
Feb 15, 2024 Maksym Andriushchenko (EPFL)
A Modern Look at the Relationship between Sharpness and Generalization
Feb 8, 2024 Julius Berner (Caltech)
Learning ReLU networks to high uniform accuracy is intractable
Feb 1, 2024 Yulia Alexandr (University of California, Berkeley)
Maximum information divergence from linear and toric models
Jan 25, 2024 Aaron Mishkin (Stanford University)
Optimal Sets and Solution Paths of ReLU Networks
Jan 18, 2024 Agustinus Kristiadi (Vector Institute)
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Dec 7, 2023 Molei Tao (Georgia Institute of Technology)
Implicit biases of large learning rates in machine learning
Nov 30, 2023 Pan Li (Georgia Tech)
The Stability of Positional Encoding for Graph Neural Networks and Graph Transformers
Nov 16, 2023 Sjoerd Dirksen (Utrecht University)
The separation capacity of random neural networks
Nov 9, 2023 Courtney Paquette (McGill University)
Hitting the High-D(imensional) Notes: An ODE for SGD learning dynamics in high-dimensions
Attention: this event is cancelled
Nov 2, 2023 Oscar Leong (Caltech)
Theory and Algorithms for Deep Generative Networks in Inverse Problems
Oct 19, 2023 Daniel Murfet (University of Melbourne)
Phase transitions in learning machines
Oct 12, 2023 Micah Goldblum (NYU)
Bridging the gap between deep learning theory and practice
Oct 5, 2023 Michael Murray (UCLA)
Training shallow ReLU networks on noisy data using hinge loss: when do we overfit and is it benign?
Sep 21, 2023 Florian Schaefer (Georgia Tech)
Solvers, Models, Learners: Statistical Inspiration for Scientific Computing
Sep 14, 2023 Vidya Muthukumar (Georgia Institute of Technology)
Classification versus regression in overparameterized regimes: Does the loss function matter?
Sep 7, 2023 Rahul Parhi (EPFL)
Deep Learning Meets Sparse Regularization
Aug 31, 2023 Tom Tirer (Bar-Ilan University)
Exploring Deep Neural Collapse via Extended and Penalized Unconstrained Features Models
Aug 17, 2023 Surbhi Goel (University of Pennsylvania)
Beyond Worst-case Sequential Prediction: Adversarial Robustness via Abstention
Aug 3, 2023 Haoyuan Sun (MIT)
A Unified Approach to Controlling Implicit Regularization via Mirror Descent
Jul 27, 2023 Sanghamitra Dutta (University of Maryland)
Foundations of Reliable and Lawful Machine Learning: Approaches Using Information Theory
Jul 20, 2023 Sitan Chen (Harvard University)
Theory for Diffusion Models
Jul 13, 2023 Behrooz Tahmasebi (MIT)
Sample Complexity Gain from Invariances: Kernel Regression, Wasserstein Distance, and Density Estimation
Jul 6, 2023 Ronen Basri (Weizmann Institute of Science)
On the Connection between Deep Neural Networks and Kernel Methods
Jun 29, 2023 Bingbin Liu (Carnegie Mellon University)
Thinking Fast with Transformers: Algorithmic Reasoning via Shortcuts
Jun 15, 2023 Benjamin Bowman (UCLA and AWS)
On the Spectral Bias of Neural Networks in the Neural Tangent Kernel Regime
Jun 8, 2023 Simone Bombari (IST Austria)
Optimization, Robustness and Privacy. A Story through the Lens of Concentration
Jun 1, 2023 Rishi Sonthalia (UCLA)
Least Squares Denoising: Non-IID Data, Transfer Learning and Under-parameterized Double Descent
May 25, 2023 Łukasz Dębowski (IPI PAN)
A Simplistic Model of Neural Scaling Laws: Multiperiodic Santa Fe Processes
May 18, 2023 Duc N.M. Hoang (University of Texas, Austin)
Revisiting Pruning as Initialization through the Lens of Ramanujan Graph
May 11, 2023 Henning Petzka (RWTH Aachen University)
Relative flatness and generalization
Apr 27, 2023 Stéphane d'Ascoli (ENS and FAIR Paris)
Double descent: insights from the random feature model
Apr 20, 2023 Aditya Grover (UCLA)
Generative Decision Making Under Uncertainty
Apr 13, 2023 Wei Zhu (University of Massachusetts Amherst)
Implicit Bias of Linear Equivariant Steerable Networks
Apr 6, 2023 Jaehoon Lee (Google Brain)
Exploring Infinite-Width Limit of Deep Neural Networks
Mar 23, 2023 Dohyun Kwon (University of Wisconsin-Madison)
Convergence of score-based generative modeling
Mar 16, 2023 Chulhee Yun (KAIST)
Shuffling-based stochastic optimization methods: bridging the theory-practice gap
Mar 9, 2023 Ricardo S. Baptista (Caltech)
On the representation and learning of triangular transport maps
Mar 2, 2023 Samy Wu Fung (Colorado School of Mines)
Using Hamilton Jacobi PDEs in Optimization
Feb 23, 2023 Jalaj Bhandari (Columbia University)
Global guarantees for policy gradient methods.
Feb 16, 2023 Julia Elisenda Grigsby (Boston College)
Functional dimension of ReLU Networks
Feb 9, 2023 Renjie Liao (University of British Columbia)
Gaussian-Bernoulli RBMs Without Tears
Feb 2, 2023 Mihaela Rosca (DeepMind)
On continuous time models of gradient descent and instability in deep learning
Jan 26, 2023 Liu Ziyin (University of Tokyo)
The Probabilistic Stability and Low-Rank Bias of SGD
Jan 19, 2023 Xiaowu Dai (UCLA)
Kernel Ordinary Differential Equations
Jan 12, 2023 Felipe Suárez-Colmenares (MIT)
Learning threshold neurons via the "edge of stability”
Dec 16, 2022 Jona Lelmi (University of Bonn)
Large data limit of the MBO scheme for data clustering
Dec 15, 2022 Nicolás García Trillos (University of Wisconsin-Madison)
Analysis of adversarial robustness and of other problems in modern machine learning
Dec 8, 2022 Mojtaba Sahraee-Ardakan (UCLA)
Equivalence of Kernel Methods and Linear Models in High Dimensions
Dec 1, 2022 Levon Nurbekyan (UCLA)
Efficient natural gradient method for large-scale optimization problems
Nov 17, 2022 Sebastian Goldt (SISSA Trieste)
On the importance of higher-order input statistics for learning in neural networks
Nov 10, 2022 El Mehdi Achour (IMT Toulouse)
The loss landscape of deep linear neural networks: A second-order analysis
Nov 3, 2022 Itay Safran (Purdue University)
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Oct 27, 2022 Lechao Xiao (Google Brain)
Eigenspace restructuring: A Principle of space and frequency in neural networks
Oct 20, 2022 Joe Kileel (University of Texas at Austin)
The Effect of Parametrization on Nonconvex Optimization Landscapes
Oct 13, 2022 Chao Ma (Stanford University)
Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives
Oct 6, 2022 Felix Dietrich (Technical University of Munich)
Learning dynamical systems from data
Sep 22, 2022 Gintare Karolina Dziugaite (Google Brain)
Deep Learning through the Lens of Data
Sep 15, 2022 Lisa Maria Kreusser (University of Bath)
Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
Sep 8, 2022 Devansh Arpit (Salesforce)
Optimization Aspects that Improve IID and OOD Generalization in Deep Learning
Aug 25, 2022 Melanie Weber (Harvard University)
Exploiting geometric structure in (matrix-valued) optimization
Aug 18, 2022 Aditya Golatkar (UCLA)
Selective Forgetting and Mixed Privacy in Deep Networks
Aug 12, 2022 Francesco Di Giovanni (Twitter)
Over-squashing and over-smoothing through the lenses of curvature and multi-particle dynamics
Aug 4, 2022 Wei Hu (University of California, Berkeley)
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
Jul 28, 2022 Arash A. Amini (UCLA)
Target alignment in truncated kernel ridge regression
Jul 21, 2022 Mark Schmidt (University of British Columbia)
Optimization Algorithms for Training Over-Parameterized Models
Jul 14, 2022 Rong Ge (Duke University)
Towards Understanding Training Dynamics for Mildly Overparametrized Models
Jul 7, 2022 Sebastian Kassing (University of Bielefeld)
Convergence of Stochastic Gradient Descent for analytic target functions
Jun 23, 2022 Lisa Maria Kreusser (University of Bath)
Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
Attention: this event is cancelled
Jun 16, 2022 Jingfeng Wu (Johns Hopkins University)
A Fine-Grained Characterization for the Implicit Bias of SGD in Least Square Problems
Jun 9, 2022 Guang Cheng (UCLA)
Nonparametric Perspective on Deep Learning
Jun 2, 2022 Tan Nguyen (UCLA)
Principled Models for Machine Learning
May 26, 2022 Melikasadat Emami (UCLA)
Asymptotics of Learning in Generalized Linear Models and Recurrent Neural Networks
May 19, 2022 Sayan Mukherjee (MPI MiS, Leipzig + Universität Leipzig)
Inference in dynamical systems and the geometry of learning group actions
May 12, 2022 Zixiang Chen (UCLA)
Benign Overfitting in Two-layer Convolutional Neural Networks
May 5, 2022 Noam Razin (Tel Aviv University)
Generalization in Deep Learning Through the Lens of Implicit Rank Minimization
Apr 21, 2022 Berfin Şimşek (École Polytechnique Fédérale de Lausanne (EPFL))
The Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetry-Induced Saddles and Global Minima Manifold
Apr 14, 2022 Christoph Lampert (Institute of Science and Technology, Austria)
Robust and Fair Multisource Learning
Apr 7, 2022 Chaoyue Liu (Ohio State University)
Transition to Linearity of Wide Neural Networks
Mar 31, 2022 Benjamin Eysenbach (Carnegie Mellon University & Google Brain)
The Information Geometry of Unsupervised Reinforcement Learning
Mar 24, 2022 Ohad Shamir (Weizmann Institute of Science)
Implicit bias in machine learning
Mar 17, 2022 Gabin Maxime Nguegnang (RWTH Aachen University)
Convergence of gradient descent for learning deep linear neural networks
Mar 11, 2022 Uri Alon (CMU)
Bottlenecks, over-squashing, and attention in Graph Neural Networks.
Mar 10, 2022 Gergely Neu (Universitat Pompeu Fabra, Barcelona)
Generalization Bounds via Convex Analysis
Mar 4, 2022 Daniel McKenzie (UCLA)
Implicit Neural Networks: What they are and how to use them.
Mar 3, 2022 Marco Mondelli (Institute of Science and Technology Austria)
Gradient Descent for Deep Neural Networks: New Perspectives from Mean-field and NTK
Feb 24, 2022 Danijar Hafner (Google Brain & University of Toronto)
General Infomax Agents through World Models
Feb 17, 2022 Maksim Velikanov (Skolkovo Institute of Science and Technology)
Spectral properties of wide neural networks and their implications to the convergence speed of different Gradient Descent algorithms
Feb 10, 2022 Hui Jin (University of California, Los Angeles)
Learning curves for Gaussian process regression with power-law priors and targets
Feb 4, 2022 Song Mei (University of California, Berkeley)
The efficiency of kernel methods on structured datasets
Jan 27, 2022 Daniel Russo (Columbia University)
Adaptivity and Confounding in Multi-armed Bandit Experiments
Jan 20, 2022 Philipp Petersen (University of Vienna)
Optimal learning in high-dimensional classification problems
Jan 13, 2022 Yuan Cao (University of California, Los Angeles)
Towards Understanding and Advancing the Generalization of Adam in Deep Learning
Dec 16, 2021 Zhi-Qin John Xu (Shanghai Jiao Tong University)
Occam’s razors in neural networks: frequency principle in training and embedding principle of loss landscape
Dec 2, 2021 Ekaterina Lobacheva (HSE University), Maxim Kodryan (HSE University)
On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
Nov 25, 2021 Michael Joswig (TU Berlin)
Geometric Disentanglement by Random Convex Polytopes
Nov 18, 2021 Daniel Soudry (Technion)
Algorithmic Bias Control in Deep learning
Nov 11, 2021 Christoph Hertrich (TU Berlin)
Towards Lower Bounds on the Depth of ReLU Neural Networks
Nov 4, 2021 Hyeyoung Park (Kyungpook National University)
Effect of Geometrical Singularity on Learning Dynamics of Neural Networks
Oct 21, 2021 David Stutz (MPI for Informatics, Saarbrücken)
Relating Adversarial Robustness and Weight Robustness Through Flatness
Oct 14, 2021 Mathias Drton (Technical University of Munich)
A Bayesian information criterion for singular models
Oct 7, 2021 Roger Grosse (University of Toronto)
Self-tuning networks: Amortizing the hypergradient computation for hyperparameter optimization
Sep 23, 2021 Huan Xiong (MBZUAI, Abu Dhabi)
On the Number of Linear Regions of Convolutional Neural Networks with Piecewise Linear Activations
Sep 16, 2021 Pavel Izmailov (New York University)
What Are Bayesian Neural Network Posteriors Really Like?
Sep 9, 2021 Shaowei Lin (working in a stealth startup)
All you need is relative information
Sep 2, 2021 Matthew Trager (Amazon US)
A Geometric View of Functional Spaces of Neural Networks
Aug 26, 2021 Guodong Zhang (University of Toronto)
Differentiable Game Dynamics: Hardness and Complexity of Equilibrium Learning
Jul 22, 2021 Soledad Villar (Johns Hopkins University)
Scalars are universal: Gauge-equivariant machine learning, structured like classical physics
Jul 15, 2021 Hossein Mobahi (Google Research)
Self-Distillation Amplifies Regularization in Hilbert Space
Jul 8, 2021 Roger Grosse (University of Toronto)
Self-tuning networks: Amortizing the hypergradient computation for hyperparameter optimization
Jul 1, 2021 Dominik Janzing (Amazon Research, Tuebingen, Germany)
Why we should prefer simple causal models
Jun 17, 2021 Ziv Goldfeld (Cornell University)
Scaling Wasserstein distances to high dimensions via smoothing
Jun 10, 2021 Isabel Valera (Saarland University and MPI for Intelligent Systems)
Algorithmic recourse: theory and practice
Jun 3, 2021 Stanisław Jastrzębski (Molecule.one and Jagiellonian University)
Reverse-engineering implicit regularization due to large learning rates in deep learning
May 27, 2021 Sebastian Reich (University of Potsdam and University of Reading)
Statistical inverse problems and affine-invariant gradient flow structures in the space of probability measures
May 13, 2021 Eirikur Agustsson (Google Research)
Universally Quantized Neural Compression
May 6, 2021 Haizhao Yang (Purdue University)
A Few Thoughts on Deep Learning-Based PDE Solvers
Apr 29, 2021 Mehrdad Farajtabar (DeepMind)
Catastrophic Forgetting in Continual Learning of Neural Networks
Apr 22, 2021 Matus Jan Telgarsky (UIUC)
Recent advances in the analysis of the implicit bias of gradient descent on deep networks
Apr 15, 2021 Stanislav Fort (Stanford University)
Neural Network Loss Landscapes in High Dimensions: Theory Meets Practice
Apr 8, 2021 Cristian Bodnar (University of Cambridge), Fabrizio Frasca (Twitter)
Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks
Apr 1, 2021 Roi Livni (Tel Aviv University)
Regularization, what is it good for?
Mar 25, 2021 Pratik Chaudhari (University of Pennsylvania)
Learning with few labeled data
Mar 18, 2021 Ryo Karakida (AIST (National Institute of Advanced Industrial Science and Technology), Tokyo)
Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks
Mar 11, 2021 Spencer Frei (Department of Statistics, UCLA)
Generalization of SGD-trained neural networks of any width in the presence of adversarial label noise
Mar 4, 2021 Marcus Hutter (DeepMind and Australian National University)
Learning Curve Theory
Feb 25, 2021 Zhiyuan Li (Princeton University)
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
Feb 18, 2021 Umut Şimşekli (INRIA - École Normale Supérieure (Paris))
Towards Building a Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
Feb 11, 2021 Samuel L. Smith (DeepMind)
A Backward Error Analysis for Random Shuffling SGD
Feb 4, 2021 Tailin Wu (Stanford University)
Phase transitions on the universal tradeoff between prediction and compression in machine learning
Jan 28, 2021 Suriya Gunasekar (Microsoft Research, Redmond)
Rethinking the role of optimization in learning
Jan 21, 2021 Taiji Suzuki (The University of Tokyo, and Center for Advanced Intelligence Project, RIKEN, Tokyo)
Statistical efficiency and optimization of deep learning from the viewpoint of non-convexity
Jan 14, 2021 Mahdi Soltanolkotabi (University of Southern California)
Learning via early stopping and untrained neural nets
Dec 17, 2020 Tim G. J. Rudner (University of Oxford)
Outcome-Driven Reinforcement Learning via Variational Inference
Nov 20, 2020 Amartya Sanyal (University of Oxford)
How benign is benign overfitting?
Nov 12, 2020 Yasaman Bahri (Google Brain)
The Large Learning Rate Phase of Wide, Deep Neural Networks
Nov 5, 2020 Kenji Kawaguchi (MIT)
Deep learning: theoretical results on optimization and mixup
Oct 29, 2020 Yaim Cooper (Institute for Advanced Study, Princeton)
The geometry of the loss function of deep neural networks
Oct 22, 2020 Boris Hanin (Princeton University)
Neural Networks at Finite Width and Large Depth
Oct 15, 2020 Guy Blanc (Stanford University)
Provable Guarantees for Decision Tree Induction
Oct 8, 2020 Yaoyu Zhang (Shanghai Jiao Tong University)
Impact of initialization on generalization of deep neural networks
Oct 1, 2020 David Rolnick (McGill University & Mila)
Expressivity and learnability: linear regions in deep ReLU networks
Sep 25, 2020 Randall Balestriero (Rice University)
Max-Affine Spline Insights into Deep Networks
Sep 24, 2020 Liwen Zhang (University of Chicago)
Tropical Geometry of Deep Neural Networks
Sep 18, 2020 Mert Pilanci (Stanford University)
Exact Polynomial-time Convex Formulations for Training Multilayer Neural Networks: The Hidden Convex Optimization Landscape
Sep 17, 2020 Mahito Sugiyama (National Institute of Informatics, JST, PRESTO)
Learning with Dually Flat Structure and Incidence Algebra
Sep 4, 2020 Nadav Cohen (Tel Aviv University)
Analyzing Optimization and Generalization in Deep Learning via Dynamics of Gradient Descent
Sep 3, 2020 Niladri S. Chatterji (UC Berkeley)
Upper and lower bounds for gradient based sampling methods
Aug 27, 2020 Felix Draxler (Heidelberg University)
Characterizing The Role of A Single Coupling Layer in Affine Normalizing Flows
Aug 21, 2020 Preetum Nakkiran (Harvard University)
Distributional Generalization: A New Kind of Generalization
Aug 20, 2020 Arthur Jacot (École Polytechnique Fédérale de Lausanne)
Neural Tangent Kernel: Convergence and Generalization of DNNs
Aug 14, 2020 Ido Nachum (École Polytechnique Fédérale de Lausanne)
Regularization by Misclassification in ReLU Neural Networks
Aug 13, 2020 Lénaïc Chizat (CNRS - Laboratoire de Mathématiques d'Orsay)
Analysis of Gradient Descent on Wide Two-Layer ReLU Neural Networks
Aug 6, 2020 Simon S. Du (University of Washington)
Ultra-wide Neural Network and Neural Tangent Kernel
Jul 31, 2020 Greg Ongie (University of Chicago)
A function space view of overparameterized neural networks
Jul 30, 2020 Robert Peharz (TU Eindhoven)
Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
Jul 23, 2020 Léonard Blier (Facebook AI Research, Université Paris Saclay, Inria)
The Description Length of Deep Learning Models
Jul 10, 2020 Nasim Rahaman (MPI-IS Tübingen, and Mila, Montréal)
On the Spectral Bias of Neural Networks
Jul 3, 2020 Kai Fong Ernest Chong (Singapore University of Technology and Design)
The approximation capabilities of neural networks
Jul 2, 2020 Wenda Zhou (Columbia University)
New perspectives on cross-validation
Jun 25, 2020 Adam Gaier (INRIA)
Weight Agnostic Neural Networks
Jun 18, 2020 Alessandro Achille (University of California, Los Angeles)
Structure of Learning Tasks and the Information in the Weights of a Deep Network
Jun 11, 2020 Poorya Mianjy (Johns Hopkins University)
Understanding the Algorithmic Regularization due to Dropout
Jun 4, 2020 Ulrich Terstiege (Rheinisch-Westfälische Technische Hochschule Aachen)
Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers
May 28, 2020 Jonathan Frankle (Massachusetts Institute of Technology)
The Lottery Ticket Hypothesis: On Sparse, Trainable Neural Networks
May 21, 2020 Dennis Elbrächter (Universität Wien)
How degenerate is the parametrization of (ReLU) neural networks?
May 14, 2020 Benjamin Fehrman (University of Oxford)
Convergence rates for the stochastic gradient descent method for non-convex objective functions
May 7, 2020 Kathlén Kohn (KTH Royal Institute of Technology)
The geometry of neural networks
Apr 30, 2020 Quynh Nguyen (Saarland University, Saarbrücken)
Loss surface of deep and wide neural networks
Apr 23, 2020 Johannes Mueller (MPI MiS, Leipzig)
Deep Ritz Revisited
Apr 16, 2020 Anton Mallasto (University of Copenhagen)
Estimation of Optimal Transport in Generative Models
Apr 9, 2020 Michael Arbel (University College London)
Kernelized Wasserstein Natural Gradient
Apr 2, 2020 Yu Guang Wang (University of New South Wales, Sydney)
Haar Graph Pooling

Administrative Contact

Katharina Matschke

MPI for Mathematics in the Sciences