Workshop on Algebraic Geometry, Combinatorics, and Machine Learning

MPI für Mathematik in den Naturwissenschaften Leipzig

E1 05 (Leibniz-Saal)

conference

08.08.22

Numerous learning algorithms rely on the geometry and combinatorics of individual hypothesis functions, function classes, their particular parametrization, and the corresponding structure of optimization objectives in parameter and function space, which also depend on the data at hand. In this meeting we will discuss how mathematical methods from algebraic geometry and combinatorics apply to machine learning, and to deep learning in particular, along with the challenges that arise.

Participation is possible both in-person and online.

Speakers

Christoph Hertrich

London School of Economics

Kathlén Kohn

KTH Royal Institute of Technology

Johannes Müller

Max Planck Institute for Mathematics in the Sciences

Yue Ren

University of Durham

Rishi Sonthalia

University of California, Los Angeles

Jayden Wang

University of Texas, Austin

Program

13:20 - 13:30	Guido Montúfar (Max Planck Institute for Mathematics in the Sciences) Welcome
13:30 - 14:00	Yue Ren (University of Durham) Tropical geometry and linear regions of maxout networks The talk is split into two parts. In the first part, we discuss bounding the number of linear regions via tropical maxout arrangements. In the second part, we give an overview of ongoing experiments on initializing maxout networks. The first part is joint work with Guido Montúfar and Leon Zhang; the second part is joint work with Iolo Jones and Ziva Urbancic.
14:00 - 14:30	Johannes Müller (Max Planck Institute for Mathematics in the Sciences) From Markov decision processes to algebraic statistics We study the reward maximization problem in partially observable Markov decision processes with stationary stochastic policies, which constitute an important model in sequential decision making under uncertainty. We obtain a description of the problem as the optimization of a linear objective over the probability simplex subject to polynomial constraints that we characterize explicitly. This allows us to describe the combinatorial and algebraic complexity of the problem, and we obtain bounds on its algebraic degree. Further, we conduct experiments using methods from numerical algebra to solve the critical equations and compare the results to standard gradient-based optimization.
14:30 - 15:00	Rishi Sonthalia (University of California, Los Angeles) Universality of Generalizations of Restricted Boltzmann Machines Restricted Boltzmann machines (RBMs) are graphical models that can be used to model probability distributions on the corners of the n-dimensional unit cube. It can be shown that the distribution modeled by an RBM with m hidden nodes can be parameterized as the Hadamard product of m tensors (called factors) that are supermodular and have flattening rank at most 2. Inspired by this, we look at two different models obtained by removing either the supermodularity or the flattening rank condition. For both models, we try to answer the following question: given an arbitrary tensor p, what is the maximum number of tensors m so that p can be represented as the Hadamard product of m factors? We show that if p has strictly positive entries, then it can be represented using n supermodular factors. On the other hand, for the model where our factors have flattening rank at most 2, we present lower and upper bounds for m. Using the proof idea, if we look at representing M-by-N matrices as Hadamard products of m rank-r matrices, we present a new proof for the lower bound from Friedenberg, Onetto, and Williams 2017.
15:00 - 15:30	Break
15:30 - 16:00	Kathlén Kohn (KTH Royal Institute of Technology) The Geometry of Linear Convolutional Networks We discuss linear convolutional neural networks (LCNs) and their critical points. We observe that the function space (i.e., the set of functions represented by LCNs) can be identified with polynomials that admit certain factorizations, and we use this perspective to describe the impact of the network’s architecture on the geometry of the function space. For instance, for LCNs with one-dimensional convolutions having stride one and arbitrary filter sizes, we provide a full description of the boundary of the function space. We further study the optimization of an objective function over such LCNs: We characterize the relations between critical points in function space and in parameter space and show that there do exist spurious critical points. We compute an upper bound on the number of critical points in function space using Euclidean distance degrees and describe dynamical invariants for gradient descent. This talk is based on joint work with Thomas Merkh, Guido Montúfar, and Matthew Trager.
16:00 - 16:30	Jayden Wang (University of Texas, Austin) Representing piecewise linear functions minimally (online) Piecewise linear functions are ubiquitous in tropical geometry, convex geometry, and machine learning. Every piecewise linear function can be represented as a tropical rational function (a difference of max-affine functions), but little has been done to define minimal representations. We propose two minimality concepts, induced by the so-called monomial length and factorization length of tropical polynomials. We will demonstrate that in practice, factorization length is better behaved for computing a minimal representation of a piecewise linear function, due to the nice properties it satisfies, while the monomial length is related to extremal problems of equivalence classes of polytope pairs. The difference between these two concepts is controlled by arrangements of tropical hypersurfaces. We obtain a different proof of the lower bound by Tseran and Montúfar for the number of regions cut out by general tropical hypersurfaces and, as a byproduct, a lower bound for the number of vertices of Minkowski sums of polytopes.
16:30 - 17:00	Christoph Hertrich (London School of Economics) Lower Bounds on Neural Network Depth via (Lattice) Polytopes (online) This talk will be centered around the following open problem about neural networks with rectified linear units (ReLUs): What is the minimum number of hidden layers required to represent every continuous piecewise linear function with input dimension n? While it is known that log(n) layers are always sufficient, no lower bound larger than 2 has been proven so far. We conjecture that the logarithmic upper bound is indeed the true answer. We provide evidence for our conjecture by proving special cases using methods from mixed-integer programming and polyhedral theory. In particular, we demonstrate how previously discovered connections between neural networks and tropical geometry can be used to translate the problem into the language of Newton polytopes. Using the theory of lattice polytopes, we then prove the conjecture under an integrality assumption. This talk is based on joint works with Amitabh Basu, Marco Di Summa, and Martin Skutella, as well as with Christian Haase and Georg Loho.
17:00 - 18:00	Discussion
18:00 - 22:00	Dinner (city/individual basis)

Participants

Carlos Améndola

TU Berlin / MPI-MiS

Pradeep Banerjee

MPI for Mathematics in the Sciences

Mara Belotti

Technische Universität Berlin

Tobias Boege

MPI-MiS

Marie Brandenburg

MPI MiS

Axel Flinth

Umeå University

Moritz Grillo

TU Berlin

Marc Härkönen

Max Planck Institute for Mathematics in the Sciences

Christoph Hertrich

London School of Economics

Ben Hollering

Max Planck Institute for Mathematics in the Sciences

Naihuan Jing

North Carolina State University

Kathlén Kohn

KTH Royal Institute of Technology

Julia Lindberg

MPI

Guido Montúfar

Max Planck Institute for Mathematics in the Sciences

Johannes Müller

Max Planck Institute for Mathematics in the Sciences

Gabriel Ong

Bowdoin College

Yue Ren

University of Durham

Kemal Rose

MPI MiS

Anna-Laura Sattelberger

MPI MiS

Anna Seigal

Harvard University

Linus Sommer

Universität Leipzig

Rishi Sonthalia

University of California, Los Angeles

Bernd Sturmfels

MPI Leipzig

Mikael Tchato

African Institute for Mathematical Sciences

Angelica Torres

KTH Royal Institute of Technology

Hanna Tseran

MPI MiS

Jayden Wang

University of Texas, Austin

Scientific Organizers

Guido Montúfar

Max Planck Institute for Mathematics in the Sciences

Administrative Contact

Katharina Matschke

Max Planck Institute for Mathematics in the Sciences