The increasing dimensionality of data in the modern machine learning age presents new challenges and opportunities. The high-dimensional settings allow one to use powerful asymptotic methods from probability theory and statistical physics to obtain precise asymptotic characterizations of the generalization errors and of the benefits of overparametrization. I will present and review some recent works in this direction, and discuss what they teach us in the broader context of generalization, double descent, and over-parameterization in modern machine learning problems.
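As a rough, self-contained illustration of the double descent phenomenon mentioned above (my own sketch, not taken from the reviewed works), the following Python snippet fits ridgeless least squares on random ReLU features of varying dimension; the test error typically peaks near the interpolation threshold and decreases again in the overparameterized regime.

```python
# Illustrative sketch (not from the reviewed works): double descent for
# ridgeless least squares on random ReLU features. The test error typically
# peaks near the interpolation threshold (n_features ~ n_train) and drops
# again as the model becomes heavily overparameterized.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 100, 2000, 20

# Noisy linear teacher generating the data.
w_star = rng.normal(size=d)
X_tr = rng.normal(size=(n_train, d))
X_te = rng.normal(size=(n_test, d))
y_tr = X_tr @ w_star + 0.5 * rng.normal(size=n_train)
y_te = X_te @ w_star

def test_mse(n_features):
    # Random ReLU feature map, shared between train and test.
    W = rng.normal(size=(d, n_features)) / np.sqrt(d)
    F_tr, F_te = np.maximum(X_tr @ W, 0), np.maximum(X_te @ W, 0)
    # Minimum-norm least-squares solution (ridgeless limit).
    coef = np.linalg.pinv(F_tr) @ y_tr
    return np.mean((F_te @ coef - y_te) ** 2)

for p in [10, 50, 90, 100, 110, 200, 1000]:
    print(f"features = {p:5d}   test MSE = {test_mse(p):.3f}")
```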
Dropout regularisation for training neural networks turns out to be very successful in practical applications. The empirical explanation of this success is based on reducing the co-adaptation of features during training. Moreover, practitioners observe that 'training with dropout converges not faster, but to a better local minimum'. However, there is hardly any mathematical understanding of these statements. In this talk I want to give a mathematical interpretation of the last statement, discuss a continuous-time model of training with dropout, and explain why it 'converges to a better local minimum' than conventional training does.
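For readers unfamiliar with the mechanism, the following minimal sketch (standard inverted dropout, not the continuous-time model of the talk) shows dropout as multiplying hidden activations by independent Bernoulli masks during training.

```python
# Minimal sketch of standard (inverted) dropout: hidden activations are
# multiplied by independent Bernoulli masks during training and rescaled
# so that their expectation matches the test-time forward pass.
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, W2, p_drop=0.5, train=True):
    h = np.maximum(x @ W1, 0)                       # hidden ReLU layer
    if train:
        mask = rng.binomial(1, 1 - p_drop, size=h.shape)
        h = h * mask / (1 - p_drop)                 # drop and rescale
    return h @ W2

x = rng.normal(size=(4, 8))                         # a batch of 4 inputs
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 1))
print(forward(x, W1, W2, train=True).shape)         # (4, 1)
```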
Niklas Breustedt (TU Braunschweig, Germany): Unrolling versus bilevel optimization in the context of learning variational models
Styliani Kampezidou (Georgia Institute of Technology, USA): Online Adaptive Learning in Energy Trading Stackelberg Games with Time-Coupling Constraints
An energy trading mechanism is proposed between a selfish energy broker (aggregator) and her selfish energy customers (prosumers). The proposed design is a Stackelberg game in which the aggregator trades energy bidirectionally between the prosumers and the wholesale electricity market for profit. To satisfy the prosumers' desired energy consumption, time-coupling constraints are introduced. The resulting game does not admit closed-form equilibrium strategies, so an online adaptive learning algorithm is proposed to address this challenge. The algorithm scales with the number of prosumers and does not require explicit knowledge of the prosumers' decision-making mechanisms. Experimental results based on real-world data from the California market demonstrate the performance of the proposed approach.
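A purely hypothetical toy version of such a price-adaptation loop (invented for exposition, not the proposed algorithm; all quantities below are made up) could look as follows: the aggregator repeatedly posts a retail price, observes the prosumers' total demand, and updates the price from observed profit alone, without any model of the prosumers' decision-making.

```python
# Hypothetical toy illustration (invented for exposition, not the proposed
# algorithm): the aggregator repeatedly posts a retail price, observes the
# prosumers' total demand, and adapts the price with a zeroth-order profit
# gradient estimate, using no model of the prosumers' decision making.
import numpy as np

rng = np.random.default_rng(0)
n_prosumers, wholesale_price = 50, 0.10
sensitivity = rng.uniform(20, 40, size=n_prosumers)   # private to prosumers

def total_demand(price):
    # Prosumers' (hidden) best responses: demand decreases with the price.
    return np.maximum(20 - sensitivity * price, 0).sum()

price, step, delta = 0.50, 1e-4, 1e-3
for t in range(2000):
    profit = (price - wholesale_price) * total_demand(price)
    profit_plus = (price + delta - wholesale_price) * total_demand(price + delta)
    grad_estimate = (profit_plus - profit) / delta      # two-point estimate
    price = float(np.clip(price + step * grad_estimate, 0.0, 1.0))

print(f"learned retail price ~ {price:.3f}")
```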
Iosif Lytras (University of Edinburgh, United Kingdom): Taming neural networks with TUSLA: Non-convex learning via adaptive stochastic gradient Langevin algorithms
Joint work with Attila Lovas, Miklós Rásonyi and Sotirios Sabanis. arXiv: https://arxiv.org/pdf/2006.14514.pdf
Johannes Müller (Max Planck Institute for Mathematics in the Sciences, Germany): The geometry of discounted stationary distributions of Markov decision processes
Oren Neumann (Goethe University Frankfurt am Main, Germany): Investment vs. reward in competitive games
Maximilian Steffen (Universität Hamburg, Germany): PAC-Bayesian Estimation for High-Dimensional Multi-Index Regression with Unknown Active Dimension
David Szeghy (Eötvös Loránd University (ELTE), Hungary): Adversarial Perturbation Stability of the Layered Group Basis Pursuit
Hanna Tseran (MPI for Mathematics in the Sciences, Germany): On the Expected Complexity of Maxout Networks
Jesse van Oostrum (Hamburg University of Technology, Germany): Parametrisation Independence of the Natural Gradient in Overparametrised Systems
Csongor-Huba Varady (Max Planck Institute for Mathematics in the Sciences (MiS), Leipzig, Germany): Natural Reweighted Wake-Sleep for Convolutional Networks
Xiaoyu Wang (University of Cambridge, United Kingdom): Lifted Bregman Networks
Chia Zargeh (University of Sao Paulo, Brazil): Applications of associative algebras in machine learning
We provide an Information-Geometric formulation of Classical Mechanics on the Riemannian manifold of probability distributions, which is an affine manifold endowed with a dually-flat connection. In a non-parametric formalism, we consider the full set of positive probability functions on a finite sample space, and we provide a specific expression for the tangent and cotangent spaces over the statistical manifold, in terms of a Hilbert bundle structure that we call the Statistical Bundle. In this setting, we compute velocities and accelerations of a one-dimensional statistical model using the canonical dual pair of parallel transports and define a coherent formalism for Lagrangian and Hamiltonian mechanics on the bundle. Finally, in a series of examples, we show how our formalism provides a consistent framework for accelerated natural gradient dynamics on the probability simplex, paving the way for direct applications in optimization, game theory and neural networks. The work is based on a joint collaboration with Goffredo Chirco and Giovanni Pistone: https://arxiv.org/abs/2009.09431
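As a minimal illustration of natural gradient dynamics on the probability simplex (my own sketch under simplifying assumptions, not the formalism of the paper), the following snippet minimizes a linear objective E_p[f]; with the Fisher metric the flow is of replicator type, discretized here as a multiplicative (exponentiated-gradient) update.

```python
# Small sketch (under simplifying assumptions, not the paper's formalism):
# natural gradient descent of the linear objective E_p[f] on the probability
# simplex, discretized as a multiplicative (exponentiated-gradient) update.
import numpy as np

f = np.array([3.0, 1.0, 2.0, 0.5])     # per-outcome losses
p = np.full_like(f, 1.0 / f.size)      # start from the uniform distribution
eta = 0.1                              # step size

for t in range(200):
    p = p * np.exp(-eta * f)           # multiplicative natural-gradient step
    p = p / p.sum()                    # renormalize onto the simplex

print(np.round(p, 3))                  # mass concentrates on argmin f (last entry)
```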
We investigate gradient descent training of wide neural networks and the corresponding implicit bias in function space. For univariate regression, the solution of training a width-n shallow ReLU network is within n^(-1/2) of the function which fits the training data and whose difference from the initial function has the smallest 2-norm of the second derivative weighted by a curvature penalty that depends on the probability distribution used to initialize the network parameters. We compute the curvature penalty function explicitly for various common initialization procedures. For multivariate regression we show an analogous result, whereby the second derivative is replaced by the Radon transform of a fractional Laplacian. For initialization schemes that yield a constant penalty function, the solutions are polyharmonic splines. Moreover, we obtain results for different activations and show that the training trajectories are captured by trajectories of smoothing splines with decreasing regularization strength. This is joint work with Hui Jin: https://arxiv.org/abs/2006.07356
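The basic setting can be reproduced in a few lines: the sketch below (my own illustration, not the paper's experiments) trains a wide shallow ReLU network on four univariate points with full-batch gradient descent; the resulting function can then be compared with spline-type interpolants of the data.

```python
# Illustrative sketch (my own, not the paper's experiments): full-batch gradient
# descent on a wide shallow ReLU network for univariate regression, with an
# NTK-style 1/sqrt(width) output scaling. The learned function interpolates the
# training data and can be compared against spline-type fits.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.array([-1.0, -0.3, 0.2, 0.8])
y_train = np.array([0.5, -0.2, 0.4, 1.0])
n = len(x_train)

width, lr, steps = 2000, 0.5, 20000
w = rng.normal(size=width)              # input weights
b = rng.normal(size=width)              # biases
a = rng.normal(size=width)              # output weights
scale = 1.0 / np.sqrt(width)            # keeps the output O(1) at initialization

def net(x):
    return scale * (np.maximum(np.outer(x, w) + b, 0) @ a)

for t in range(steps):
    pre = np.outer(x_train, w) + b                   # (n, width) pre-activations
    act = np.maximum(pre, 0)
    resid = scale * (act @ a) - y_train              # prediction error, shape (n,)
    grad_pre = scale * np.outer(resid, a) * (pre > 0) / n
    a -= lr * scale * (act.T @ resid) / n
    w -= lr * (x_train @ grad_pre)
    b -= lr * grad_pre.sum(axis=0)

print(np.round(net(x_train), 2))                     # close to y_train after training
print(np.round(net(np.linspace(-1.5, 1.5, 7)), 2))   # the learned function off the data
```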