Math Machine Learning seminar MPI MIS + UCLA

Upcoming Events of this Seminar

22.05.25
05.06.25

Previous Events

01.05.25
Mufan Li (Princeton) : The Proportional Scaling Limit of Neural Networks
17.04.25
Justin Chen (UNC Chapel Hill) : Reverse Thinking Makes LLMs Stronger Reasoners
10.04.25
Zhenmei Shi (MongoDB + Voyage AI) : Understanding Training and Adaptation in Feature Learning: From Two-Layer Networks to Foundation Models
03.04.25
Ziqing Xu (University of Pennsylvania) : Learning Dynamics of Pretraining and Finetuning for Linear Models
27.03.25
Behrad Moniri (University of Pennsylvania) : Feature Learning in Shallow Neural Networks: From Theory to Optimization Algorithms
20.03.25
Jingyang Lyu (University of Wisconsin–Madison) : Why are the logits of trained models distorted? A theory of overfitting for imbalanced classification
13.02.25
Joe Kileel (UT Austin) : Geometry and Optimization of Shallow Polynomial Networks
06.02.25
Alvaro Ribot (Harvard University) : Decomposing tensors via rank-one approximations
30.01.25
Francesco Vaccarino (Politecnico di Torino) : Topological obstruction to the training of shallow ReLU neural networks
23.01.25
Baran Hashemi (Technical University of Munich) : Can Transformers Do Enumerative Geometry?
16.01.25
Sara-Viola Kuntz (Technical University of Munich) : Analysis of the Geometric Structure of Neural Networks and Neural ODEs via Morse Functions
19.12.24
Petros Maragos (National Technical University of Athens) : Tropical Geometry and its Applications to Machine Learning
05.12.24
Jing An (Duke) : On the convergence of two-timescale learning algorithms
21.11.24
Massimiliano Datres (University of Trento) : A two-scale Complexity Measure for Deep Learning Models
07.11.24
Enrico Malatesta (Bocconi University) : The manifold of low energy states in neural networks: structure and barrier to algorithms
31.10.24
Gabor Lugosi (Universitat Pompeu Fabra) : Structure learning and property testing in graphical models
24.10.24
Nate Gillman (Brown University) : Mode Collapse in Self-Consuming Generative Models
17.10.24
Giovanni Paolini (University of Bologna) : Mathematics in the Age of Generative AI
10.10.24
Zehua Lai (UT Austin) : ReLU transformers and piecewise polynomials
19.09.24
Morgane Austern (Harvard) : Transport distances and novel uncertainty quantification results
05.09.24
David Holzmüller (INRIA Paris) : Generalization Theory of Linearized Neural Networks
29.08.24
Yoni Dukler (AWS AI) : A review of LLMs from the perspective of memory and compute
22.08.24
Guohao Shen (Hong Kong Polytechnic) : Exploring the Complexity of Deep Neural Networks through Functional Equivalence
15.08.24
Liam Madden (UBC) : Upper and lower memory capacity bounds of transformers for next-token prediction
08.08.24
Pan Lu (UCLA) : Advancing Mathematical Reasoning with Language Models
01.08.24
18.07.24
Tingting Tang (San Diego State University Imperial Valley) : Using machine learning methods to approximate real discriminant loci
11.07.24
Ruihan Wu (UCSD) : Harnessing Online and Bandit Algorithms for Two Practical Challenges
13.06.24
Vahid Shahverdi (KTH) : Algebraic Complexity and Neurovariety of Linear Convolutional Networks
06.06.24
Grigoris Chrysos (UW Madison) : Are activation functions required for learning in all deep networks?
30.05.24
Mariya Toneva (MPI SWS) : Language modeling beyond language modeling
23.05.24
Yiqiao Zhong (UW Madison) : Why interpolating neural nets generalize well: recent insights from neural tangent model
16.05.24
Rémi Gribonval (University of Lyon) : Conservation Laws for Gradient Flows
09.05.24
Axel Flinth (Umeå University) : Achieving equivariance in neural networks
02.05.24
Etienne Boursier (INRIA Paris-Saclay) : Early Alignment in two-layer networks training is a two-edged sword
25.04.24
Johannes Maly (LMU München) : On the relation between over- and reparametrization of gradient descent
18.04.24
Jörg Lücke (University of Oldenburg) : On the Convergence and Transformation of the ELBO Objective to Entropy Sums
11.04.24
Semih Cayci (RWTH Aachen) : A Non-Asymptotic Analysis of Gradient Descent for Recurrent Neural Networks
04.04.24
Joachim Bona-Pellissier (MaLGa, Università di Genova) : Geometry-induced Implicit Regularization in Deep ReLU Neural Networks
21.03.24
Hossein Taheri (UCSB) : Generalization and Stability in Interpolating Neural Networks
14.03.24
Courtney Paquette (McGill University) : Hitting the High-D(imensional) Notes: An ODE for SGD learning dynamics in high-dimensions
07.03.24
Noam Razin (Tel Aviv University) : Analyses of Policy Gradient for Language Model Finetuning and Optimal Control
22.02.24
Tim Rudner (New York University) : Data-Driven Priors for Trustworthy Machine Learning
15.02.24
Maksym Andriushchenko (EPFL) : A Modern Look at the Relationship between Sharpness and Generalization
08.02.24
Julius Berner (Caltech) : Learning ReLU networks to high uniform accuracy is intractable
01.02.24
Yulia Alexandr (University of California, Berkeley) : Maximum information divergence from linear and toric models
25.01.24
Aaron Mishkin (Stanford University) : Optimal Sets and Solution Paths of ReLU Networks
18.01.24
Agustinus Kristiadi (Vector Institute) : The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
07.12.23
Molei Tao (Georgia Institute of Technology) : Implicit biases of large learning rates in machine learning
30.11.23
Pan Li (Georgia Tech) : The Stability of Positional Encoding for Graph Neural Networks and Graph Transformers
16.11.23
Sjoerd Dirksen (Utrecht University) : The separation capacity of random neural networks
09.11.23
Courtney Paquette (McGill University) : Hitting the High-D(imensional) Notes: An ODE for SGD learning dynamics in high-dimensions
Attention: this event is cancelled
02.11.23
Oscar Leong (Caltech) : Theory and Algorithms for Deep Generative Networks in Inverse Problems
19.10.23
Daniel Murfet (University of Melbourne) : Phase transitions in learning machines
12.10.23
Micah Goldblum (NYU) : Bridging the gap between deep learning theory and practice
05.10.23
Michael Murray (UCLA) : Training shallow ReLU networks on noisy data using hinge loss: when do we overfit and is it benign?
21.09.23
Florian Schaefer (Georgia Tech) : Solvers, Models, Learners: Statistical Inspiration for Scientific Computing
14.09.23
Vidya Muthukumar (Georgia Institute of Technology) : Classification versus regression in overparameterized regimes: Does the loss function matter?
07.09.23
Rahul Parhi (EPFL) : Deep Learning Meets Sparse Regularization
31.08.23
Tom Tirer (Bar-Ilan University) : Exploring Deep Neural Collapse via Extended and Penalized Unconstrained Features Models
17.08.23
Surbhi Goel (University of Pennsylvania) : Beyond Worst-case Sequential Prediction: Adversarial Robustness via Abstention
03.08.23
Haoyuan Sun (MIT) : A Unified Approach to Controlling Implicit Regularization via Mirror Descent
27.07.23
Sanghamitra Dutta (University of Maryland) : Foundations of Reliable and Lawful Machine Learning: Approaches Using Information Theory
20.07.23
Sitan Chen (Harvard University) : Theory for Diffusion Models
13.07.23
Behrooz Tahmasebi (MIT) : Sample Complexity Gain from Invariances: Kernel Regression, Wasserstein Distance, and Density Estimation
06.07.23
Ronen Basri (Weizmann Institute of Science) : On the Connection between Deep Neural Networks and Kernel Methods
29.06.23
Bingbin Liu (Carnegie Mellon University) : Thinking Fast with Transformers: Algorithmic Reasoning via Shortcuts
15.06.23
Benjamin Bowman (UCLA and AWS) : On the Spectral Bias of Neural Networks in the Neural Tangent Kernel Regime
08.06.23
Simone Bombari (IST Austria) : Optimization, Robustness and Privacy. A Story through the Lens of Concentration
01.06.23
Rishi Sonthalia (UCLA) : Least Squares Denoising: Non-IID Data, Transfer Learning and Under-parameterized Double Descent
25.05.23
Łukasz Dębowski (IPI PAN) : A Simplistic Model of Neural Scaling Laws: Multiperiodic Santa Fe Processes
18.05.23
Duc N.M. Hoang (University of Texas, Austin) : Revisiting Pruning at Initialization through the Lens of Ramanujan Graph
11.05.23
Henning Petzka (RWTH Aachen University) : Relative flatness and generalization
27.04.23
Stéphane d'Ascoli (ENS and FAIR Paris) : Double descent: insights from the random feature model
20.04.23
Aditya Grover (UCLA) : Generative Decision Making Under Uncertainty
13.04.23
Wei Zhu (University of Massachusetts Amherst) : Implicit Bias of Linear Equivariant Steerable Networks
06.04.23
Jaehoon Lee (Google Brain) : Exploring Infinite-Width Limit of Deep Neural Networks
23.03.23
Dohyun Kwon (University of Wisconsin-Madison) : Convergence of score-based generative modeling
16.03.23
Chulhee Yun (KAIST) : Shuffling-based stochastic optimization methods: bridging the theory-practice gap
09.03.23
Ricardo S. Baptista (Caltech) : On the representation and learning of triangular transport maps
02.03.23
Samy Wu Fung (Colorado School of Mines) : Using Hamilton-Jacobi PDEs in Optimization
23.02.23
Jalaj Bhandari (Columbia University) : Global guarantees for policy gradient methods.
16.02.23
Julia Elisenda Grigsby (Boston College) : Functional dimension of ReLU Networks
09.02.23
Renjie Liao (University of British Columbia) : Gaussian-Bernoulli RBMs Without Tears
02.02.23
Mihaela Rosca (DeepMind) : On continuous time models of gradient descent and instability in deep learning
26.01.23
Liu Ziyin (University of Tokyo) : The Probabilistic Stability and Low-Rank Bias of SGD
19.01.23
Xiaowu Dai (UCLA) : Kernel Ordinary Differential Equations
12.01.23
Felipe Suárez-Colmenares (MIT) : Learning threshold neurons via the "edge of stability"
16.12.22
Jona Lelmi (University of Bonn) : Large data limit of the MBO scheme for data clustering
15.12.22
Nicolás García Trillos (University of Wisconsin-Madison) : Analysis of adversarial robustness and of other problems in modern machine learning
08.12.22
Mojtaba Sahraee-Ardakan (UCLA) : Equivalence of Kernel Methods and Linear Models in High Dimensions
01.12.22
Levon Nurbekyan (UCLA) : Efficient natural gradient method for large-scale optimization problems
17.11.22
Sebastian Goldt (SISSA Trieste) : On the importance of higher-order input statistics for learning in neural networks
10.11.22
El Mehdi Achour (IMT Toulouse) : The loss landscape of deep linear neural networks: A second-order analysis
03.11.22
Itay Safran (Purdue University) : On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
27.10.22
Lechao Xiao (Google Brain) : Eigenspace restructuring: A Principle of space and frequency in neural networks
20.10.22
Joe Kileel (University of Texas at Austin) : The Effect of Parametrization on Nonconvex Optimization Landscapes
13.10.22
Chao Ma (Stanford University) : Implicit bias of optimization algorithms for neural networks: static and dynamic perspectives
06.10.22
Felix Dietrich (Technical University of Munich) : Learning dynamical systems from data
22.09.22
Gintare Karolina Dziugaite (Google Brain) : Deep Learning through the Lens of Data
15.09.22
Lisa Maria Kreusser (University of Bath) : Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
08.09.22
Devansh Arpit (Salesforce) : Optimization Aspects that Improve IID and OOD Generalization in Deep Learning
25.08.22
Melanie Weber (Harvard University) : Exploiting geometric structure in (matrix-valued) optimization
18.08.22
Aditya Golatkar (UCLA) : Selective Forgetting and Mixed Privacy in Deep Networks
12.08.22
Francesco Di Giovanni (Twitter) : Over-squashing and over-smoothing through the lenses of curvature and multi-particle dynamics
04.08.22
Wei Hu (University of California, Berkeley) : More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize
28.07.22
Arash A. Amini (UCLA) : Target alignment in truncated kernel ridge regression
21.07.22
Mark Schmidt (University of British Columbia) : Optimization Algorithms for Training Over-Parameterized Models
14.07.22
Rong Ge (Duke University) : Towards Understanding Training Dynamics for Mildly Overparametrized Models
07.07.22
Sebastian Kassing (University of Bielefeld) : Convergence of Stochastic Gradient Descent for analytic target functions
23.06.22
Lisa Maria Kreusser (University of Bath) : Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)
Attention: this event is cancelled
16.06.22
Jingfeng Wu (Johns Hopkins University) : A Fine-Grained Characterization for the Implicit Bias of SGD in Least Square Problems
09.06.22
Guang Cheng (UCLA) : Nonparametric Perspective on Deep Learning
02.06.22
Tan Nguyen (UCLA) : Principled Models for Machine Learning
26.05.22
Melikasadat Emami (UCLA) : Asymptotics of Learning in Generalized Linear Models and Recurrent Neural Networks
19.05.22
Sayan Mukherjee (MPI MiS, Leipzig + Universität Leipzig) : Inference in dynamical systems and the geometry of learning group actions
12.05.22
Zixiang Chen (UCLA) : Benign Overfitting in Two-layer Convolutional Neural Networks
05.05.22
Noam Razin (Tel Aviv University) : Generalization in Deep Learning Through the Lens of Implicit Rank Minimization
21.04.22
Berfin Şimşek (École Polytechnique Fédérale de Lausanne (EPFL)) : The Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetry-Induced Saddles and Global Minima Manifold
14.04.22
Christoph Lampert (Institute of Science and Technology, Austria) : Robust and Fair Multisource Learning
07.04.22
Chaoyue Liu (Ohio State University) : Transition to Linearity of Wide Neural Networks
31.03.22
Benjamin Eysenbach (Carnegie Mellon University & Google Brain) : The Information Geometry of Unsupervised Reinforcement Learning
24.03.22
Ohad Shamir (Weizmann Institute of Science) : Implicit bias in machine learning
17.03.22
Gabin Maxime Nguegnang (RWTH Aachen University) : Convergence of gradient descent for learning deep linear neural networks
11.03.22
Uri Alon (CMU) : Bottlenecks, over-squashing, and attention in Graph Neural Networks.
10.03.22
Gergely Neu (Universitat Pompeu Fabra, Barcelona) : Generalization Bounds via Convex Analysis
04.03.22
Daniel McKenzie (UCLA) : Implicit Neural Networks: What they are and how to use them.
03.03.22
Marco Mondelli (Institute of Science and Technology Austria) : Gradient Descent for Deep Neural Networks: New Perspectives from Mean-field and NTK
24.02.22
Danijar Hafner (Google Brain & University of Toronto) : General Infomax Agents through World Models
17.02.22
Maksim Velikanov (Skolkovo Institute of Science and Technology) : Spectral properties of wide neural networks and their implications to the convergence speed of different Gradient Descent algorithms
10.02.22
Hui Jin (University of California, Los Angeles) : Learning curves for Gaussian process regression with power-law priors and targets
04.02.22
Song Mei (University of California, Berkeley) : The efficiency of kernel methods on structured datasets
27.01.22
Daniel Russo (Columbia University) : Adaptivity and Confounding in Multi-armed Bandit Experiments
20.01.22
Philipp Petersen (University of Vienna) : Optimal learning in high-dimensional classification problems
13.01.22
Yuan Cao (University of California, Los Angeles) : Towards Understanding and Advancing the Generalization of Adam in Deep Learning
16.12.21
Zhi-Qin John Xu (Shanghai Jiao Tong University) : Occam’s razors in neural networks: frequency principle in training and embedding principle of loss landscape
02.12.21
Ekaterina Lobacheva (HSE University), Maxim Kodryan (HSE University) : On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
25.11.21
Michael Joswig (TU Berlin) : Geometric Disentanglement by Random Convex Polytopes
18.11.21
Daniel Soudry (Technion) : Algorithmic Bias Control in Deep learning
11.11.21
Christoph Hertrich (TU Berlin) : Towards Lower Bounds on the Depth of ReLU Neural Networks
04.11.21
Hyeyoung Park (Kyungpook National University) : Effect of Geometrical Singularity on Learning Dynamics of Neural Networks
21.10.21
David Stutz (MPI for Informatics, Saarbrücken) : Relating Adversarial Robustness and Weight Robustness Through Flatness
14.10.21
Mathias Drton (Technical University of Munich) : A Bayesian information criterion for singular models
07.10.21
Roger Grosse (University of Toronto) : Self-tuning networks: Amortizing the hypergradient computation for hyperparameter optimization
23.09.21
Huan Xiong (MBZUAI, Abu Dhabi) : On the Number of Linear Regions of Convolutional Neural Networks with Piecewise Linear Activations
16.09.21
Pavel Izmailov (New York University) : What Are Bayesian Neural Network Posteriors Really Like?
09.09.21
Shaowei Lin (working in a stealth startup) : All you need is relative information
02.09.21
Matthew Trager (Amazon US) : A Geometric View of Functional Spaces of Neural Networks
26.08.21
Guodong Zhang (University of Toronto) : Differentiable Game Dynamics: Hardness and Complexity of Equilibrium Learning
22.07.21
Soledad Villar (Johns Hopkins University) : Scalars are universal: Gauge-equivariant machine learning, structured like classical physics
15.07.21
Hossein Mobahi (Google Research) : Self-Distillation Amplifies Regularization in Hilbert Space
08.07.21
Roger Grosse (University of Toronto) : Self-tuning networks: Amortizing the hypergradient computation for hyperparameter optimization
01.07.21
Dominik Janzing (Amazon Research, Tübingen, Germany) : Why we should prefer simple causal models
17.06.21
Ziv Goldfeld (Cornell University) : Scaling Wasserstein distances to high dimensions via smoothing
10.06.21
Isabel Valera (Saarland University and MPI for Intelligent Systems) : Algorithmic recourse: theory and practice
03.06.21
Stanisław Jastrzębski (Molecule.one and Jagiellonian University) : Reverse-engineering implicit regularization due to large learning rates in deep learning
27.05.21
Sebastian Reich (University of Potsdam and University of Reading) : Statistical inverse problems and affine-invariant gradient flow structures in the space of probability measures
13.05.21
Eirikur Agustsson (Google Research) : Universally Quantized Neural Compression
06.05.21
Haizhao Yang (Purdue University) : A Few Thoughts on Deep Learning-Based PDE Solvers
29.04.21
Mehrdad Farajtabar (DeepMind) : Catastrophic Forgetting in Continual Learning of Neural Networks
22.04.21
Matus Jan Telgarsky (UIUC) : Recent advances in the analysis of the implicit bias of gradient descent on deep networks
15.04.21
Stanislav Fort (Stanford University) : Neural Network Loss Landscapes in High Dimensions: Theory Meets Practice
08.04.21
Cristian Bodnar (University of Cambridge), Fabrizio Frasca (Twitter) : Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks
01.04.21
Roi Livni (Tel Aviv University) : Regularization, what is it good for?
25.03.21
Pratik Chaudhari (University of Pennsylvania) : Learning with few labeled data
18.03.21
Ryo Karakida (AIST (National Institute of Advanced Industrial Science and Technology), Tokyo) : Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks
11.03.21
Spencer Frei (Department of Statistics, UCLA) : Generalization of SGD-trained neural networks of any width in the presence of adversarial label noise
04.03.21
Marcus Hutter (DeepMind and Australian National University) : Learning Curve Theory
25.02.21
Zhiyuan Li (Princeton University) : Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
18.02.21
Umut Şimşekli (INRIA - École Normale Supérieure (Paris)) : Towards Building a Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
11.02.21
Samuel L. Smith (DeepMind) : A Backward Error Analysis for Random Shuffling SGD
04.02.21
Tailin Wu (Stanford University) : Phase transitions on the universal tradeoff between prediction and compression in machine learning
28.01.21
Suriya Gunasekar (Microsoft Research, Redmond) : Rethinking the role of optimization in learning
21.01.21
Taiji Suzuki (The University of Tokyo, and Center for Advanced Intelligence Project, RIKEN, Tokyo) : Statistical efficiency and optimization of deep learning from the viewpoint of non-convexity
14.01.21
Mahdi Soltanolkotabi (University of Southern California) : Learning via early stopping and untrained neural nets
17.12.20
Tim G. J. Rudner (University of Oxford) : Outcome-Driven Reinforcement Learning via Variational Inference
20.11.20
Amartya Sanyal (University of Oxford) : How benign is benign overfitting?
12.11.20
Yasaman Bahri (Google Brain) : The Large Learning Rate Phase of Wide, Deep Neural Networks
05.11.20
Kenji Kawaguchi (MIT) : Deep learning: theoretical results on optimization and mixup
29.10.20
Yaim Cooper (Institute for Advanced Study, Princeton) : The geometry of the loss function of deep neural networks
22.10.20
Boris Hanin (Princeton University) : Neural Networks at Finite Width and Large Depth
15.10.20
Guy Blanc (Stanford University) : Provable Guarantees for Decision Tree Induction
08.10.20
Yaoyu Zhang (Shanghai Jiao Tong University) : Impact of initialization on generalization of deep neural networks
01.10.20
David Rolnick (McGill University & Mila) : Expressivity and learnability: linear regions in deep ReLU networks
25.09.20
Randall Balestriero (Rice University) : Max-Affine Spline Insights into Deep Networks
24.09.20
Liwen Zhang (University of Chicago) : Tropical Geometry of Deep Neural Networks
18.09.20
Mert Pilanci (Stanford University) : Exact Polynomial-time Convex Formulations for Training Multilayer Neural Networks: The Hidden Convex Optimization Landscape
17.09.20
Mahito Sugiyama (National Institute of Informatics, JST, PRESTO) : Learning with Dually Flat Structure and Incidence Algebra
04.09.20
Nadav Cohen (Tel Aviv University) : Analyzing Optimization and Generalization in Deep Learning via Dynamics of Gradient Descent
03.09.20
Niladri S. Chatterji (UC Berkeley) : Upper and lower bounds for gradient based sampling methods
27.08.20
Felix Draxler (Heidelberg University) : Characterizing The Role of A Single Coupling Layer in Affine Normalizing Flows
21.08.20
Preetum Nakkiran (Harvard University) : Distributional Generalization: A New Kind of Generalization
20.08.20
Arthur Jacot (École Polytechnique Fédérale de Lausanne) : Neural Tangent Kernel: Convergence and Generalization of DNNs
14.08.20
Ido Nachum (École Polytechnique Fédérale de Lausanne) : Regularization by Misclassification in ReLU Neural Networks
13.08.20
Lénaïc Chizat (CNRS - Laboratoire de Mathématiques d'Orsay) : Analysis of Gradient Descent on Wide Two-Layer ReLU Neural Networks
06.08.20
Simon S. Du (University of Washington) : Ultra-wide Neural Network and Neural Tangent Kernel
31.07.20
Greg Ongie (University of Chicago) : A function space view of overparameterized neural networks
30.07.20
Robert Peharz (TU Eindhoven) : Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
23.07.20
Léonard Blier (Facebook AI Research, Université Paris Saclay, Inria) : The Description Length of Deep Learning Models
10.07.20
Nasim Rahaman (MPI-IS Tübingen, and Mila, Montréal) : On the Spectral Bias of Neural Networks
03.07.20
Kai Fong Ernest Chong (Singapore University of Technology and Design) : The approximation capabilities of neural networks
02.07.20
Wenda Zhou (Columbia University) : New perspectives on cross-validation
25.06.20
Adam Gaier (INRIA) : Weight Agnostic Neural Networks
18.06.20
Alessandro Achille (University of California, Los Angeles) : Structure of Learning Tasks and the Information in the Weights of a Deep Network
11.06.20
Poorya Mianjy (Johns Hopkins University) : Understanding the Algorithmic Regularization due to Dropout
04.06.20
Ulrich Terstiege (Rheinisch-Westfälische Technische Hochschule Aachen) : Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers
28.05.20
Jonathan Frankle (Massachusetts Institute of Technology) : The Lottery Ticket Hypothesis: On Sparse, Trainable Neural Networks
21.05.20
Dennis Elbrächter (Universität Wien) : How degenerate is the parametrization of (ReLU) neural networks?
14.05.20
Benjamin Fehrman (University of Oxford) : Convergence rates for the stochastic gradient descent method for non-convex objective functions
07.05.20
Kathlén Kohn (KTH Royal Institute of Technology) : The geometry of neural networks
30.04.20
Quynh Nguyen (University Saarbruecken) : Loss surface of deep and wide neural networks
23.04.20
Johannes Mueller (MPI MiS, Leipzig) : Deep Ritz Revisited
16.04.20
Anton Mallasto (University of Copenhagen) : Estimation of Optimal Transport in Generative Models
09.04.20
Michael Arbel (University College London) : Kernelized Wasserstein Natural Gradient
02.04.20