Geometry plays a crucial role in many Machine Learning problems, as it effectively captures the underlying structure and regularity of data. Additionally, Geometry provides a versatile framework for analyzing, unifying and generalizing Machine Learning methods to new settings.
GaML-23 aims to foster interaction and collaboration among researchers and practitioners working at the intersection of Geometry and Machine Learning. The Workshop will consist of invited talks, including keynotes by well-known experts, as well as hands-on tutorials on implementing Geometric Machine Learning algorithms.
Limited funds are available to cover some travel and accommodation costs for early-career participants, mainly targeting postdoctoral researchers and Ph.D. students. Applicants are expected to submit a brief academic CV and a statement of purpose; Ph.D. students applying for funding must additionally provide a letter of support from their advisor(s). The funding application is included in the registration form.
Deadlines
Applications for funding are now closed.
In-person registration for the Main Venue (MPI MiS, E1 05) has reached capacity and is now closed. It is still possible to register for the live stream.
WARNING: Please be wary if a company directly contacts you regarding payment for accommodation in Leipzig: this is fraud / phishing.
In many applications, especially in the sciences, data and tasks have known invariances. Encoding such invariances directly into a machine learning model can improve learning outcomes, but it also poses challenges for efficient model design.
In the first part of the talk, we will focus on the invariances relevant to eigenvectors and eigenspaces being inputs to a neural network. Such inputs are important, for instance, for graph representation learning. We will discuss targeted architectures that can universally express functions with the relevant invariances or equivariances - sign flips and changes of basis - and their theoretical and empirical benefits.

Second, we will take a broader, theoretical perspective. Empirically, it is known that encoding invariances into the machine learning model can reduce sample complexity. For the simplified setting of kernel ridge regression or random features, we will discuss new bounds that illustrate two ways in which invariances can reduce sample complexity. Our results hold for learning on manifolds and for invariances to (almost) any group action.

This talk is based on joint work with Joshua Robinson, Derek Lim, Behrooz Tahmasebi, Lingxiao Zhao, Tess Smidt, Suvrit Sra and Haggai Maron.

Bio: Stefanie Jegelka is a Humboldt Professor at TU Munich and an Associate Professor (on leave) in the Department of EECS at MIT. Before joining MIT, she was a postdoctoral researcher at UC Berkeley, and obtained her PhD from ETH Zurich and the Max Planck Institute for Intelligent Systems. Stefanie has received a Sloan Research Fellowship, an NSF CAREER Award, a DARPA Young Faculty Award, the German Pattern Recognition Award, a Best Paper Award at ICML, and an invitation as a sectional lecturer at the International Congress of Mathematicians. She has co-organized multiple workshops on (discrete) optimization in machine learning and graph representation learning, and has served as an Action Editor at JMLR and a program chair of the International Conference on Machine Learning (ICML) 2022. Her research interests span the theory and practice of algorithmic machine learning, in particular learning problems that involve combinatorial, algebraic or geometric structure.
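As a rough, self-contained illustration of the sign-invariance discussed in the first part of the talk (not the speaker's implementation; the tiny MLP and the path graph are placeholders), the following numpy sketch encodes Laplacian eigenvectors so that flipping the sign of any eigenvector leaves the output unchanged:

```python
import numpy as np

# Minimal sketch (not the speaker's code): a sign-invariant encoding of
# eigenvectors in the spirit of f(v) + f(-v). The tiny random MLP `phi`
# is purely illustrative.
rng = np.random.default_rng(0)

def phi(v, W1, W2):
    """A tiny illustrative MLP applied to one eigenvector."""
    return W2 @ np.tanh(W1 @ v)

def sign_invariant_embedding(eigvecs, W1, W2):
    """Encode each eigenvector so that flipping its sign changes nothing."""
    return np.stack([phi(v, W1, W2) + phi(-v, W1, W2) for v in eigvecs.T])

# Toy graph Laplacian eigenvectors for a path graph on 5 nodes.
A = np.diag(np.ones(4), 1); A = A + A.T
L = np.diag(A.sum(1)) - A
_, U = np.linalg.eigh(L)

W1 = rng.normal(size=(8, 5)); W2 = rng.normal(size=(4, 8))
z1 = sign_invariant_embedding(U, W1, W2)
z2 = sign_invariant_embedding(U * np.array([1, -1, 1, -1, 1]), W1, W2)  # flip some signs
print(np.allclose(z1, z2))  # True: the embedding is invariant to sign flips
```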
Artificial neural networks are, for example, prone to being fooled by carefully perturbed inputs that cause egregious misclassifications. Graph theory has been used extensively to model real data such as brain networks, social interactions, and others. In this talk, I will show how graph-based approaches help to investigate the inner workings of artificial neural networks in two different experiments: adversarial attacks and catastrophic forgetting.
We will consider modeling shapes and fields via topological and lifted-topological transforms. Specifically, we show how the Euler Characteristic Transform and the Lifted Euler Characteristic Transform can be used in practice for statistical analysis of shape and field data. We also describe a moduli space of shapes for which we can provide a complexity metric. We also provide a sheaf-theoretic construction of shape space that does not require diffeomorphisms or correspondence. A direct consequence of this sheaf-theoretic construction is that, for meshes in three dimensions, 0-dimensional homology is enough to characterize the shape. We will also discuss Gaussian processes on fiber bundles and applications to evolutionary questions about shapes. Applications in biomedical imaging and evolutionary anthropology will be discussed throughout the talk.
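The following toy sketch (my own illustrative example, not code from the talk) shows what the Euler Characteristic Transform computes for a tiny triangle mesh: for each direction and height threshold, the Euler characteristic of the sublevel complex:

```python
import numpy as np

# Illustrative sketch: the ECT of a tiny triangle mesh. For a direction v and
# threshold t, count the Euler characteristic (#vertices - #edges + #faces)
# of the subcomplex of simplices whose vertices all have height <x, v> <= t.
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0], [1.5, 1.0]])
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
faces = [(0, 1, 2), (1, 2, 3)]

def ect_curve(v, thresholds):
    h = vertices @ v                       # height of each vertex in direction v
    curve = []
    for t in thresholds:
        nv = np.sum(h <= t)
        ne = sum(max(h[i], h[j]) <= t for i, j in edges)
        nf = sum(max(h[i], h[j], h[k]) <= t for i, j, k in faces)
        curve.append(nv - ne + nf)
    return np.array(curve)

ts = np.linspace(-0.5, 2.0, 6)
print(ect_curve(np.array([0.0, 1.0]), ts))   # Euler characteristic as a function of height
print(ect_curve(np.array([1.0, 0.0]), ts))   # a second direction
```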
AI systems often need to process data that finds its origins in our real physical world, and is thus inherently grounded in geometry and physics. It is then evident that neural architectures capable of preserving this grounding in their representations excel in tasks involving critical geometric aspects. This observation is particularly noticeable in domains such as computational chemistry, physics, and fields like medical image analysis.
In this talk, I will argue that equivariant neural networks are precisely the class of architectures that excel in preserving this notion of geometric and physical grounding. I will present the idea that these networks can be viewed as creating neural ideograms: geometric symbols that can represent both literal geometric objects (pictograms) and more abstract concepts (ideograms).
Furthermore, I will showcase specific examples of equivariant methods that employ geometric objects (such as multi-vectors and spherical harmonics/irreps) as internal hidden states of neurons, in contrast to the conventional use of standard Euclidean vectors. I will argue that the use of such internal states for representing geometric information allows neural networks to excel in tasks demanding a form of geometric reasoning. To illustrate this, I will show an application of these techniques in the generative modelling of molecules.
As a technical contribution, in addition to the more conceptual ideas mentioned above, I will present a simple recipe to build equivariant architectures based on a generalized definition of weight sharing as conditional message passing, with interaction functions conditioned on pair-wise invariant attributes that represent equivalence classes of point-pairs.
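To make the last point concrete, here is a minimal numpy sketch (an assumption-laden illustration, not the speaker's architecture) of a message-passing layer whose shared interaction function is conditioned on a pairwise invariant attribute, here the inter-point distance:

```python
import numpy as np

# Messages between points i and j are produced by one shared function
# conditioned on an invariant attribute, here ||x_i - x_j||, which does not
# change under rotations and translations of the point cloud.
rng = np.random.default_rng(0)

def mlp(z, W1, W2):
    return W2 @ np.tanh(W1 @ z)

def conditional_message_passing(x, h, W1, W2):
    n = len(x)
    out = np.zeros_like(h)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = np.linalg.norm(x[i] - x[j])            # pairwise invariant attribute
            z = np.concatenate([h[i], h[j], [d]])      # condition the message on it
            out[i] += mlp(z, W1, W2)
    return out

x = rng.normal(size=(6, 3))        # 3D positions
h = rng.normal(size=(6, 4))        # node features
W1 = rng.normal(size=(16, 9)); W2 = rng.normal(size=(4, 16))

# Rotating the positions leaves the output unchanged, since only distances enter.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
print(np.allclose(conditional_message_passing(x, h, W1, W2),
                  conditional_message_passing(x @ Q.T, h, W1, W2)))
```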
One of the key reasons for the success of geometric deep learning is that it allows a significant dimensionality reduction thanks to the introduction of an inductive bias based on prior knowledge of the symmetries of the data studied. Parallel to the development of geometric deep learning techniques, the theory of group equivariant non-expansive operators (GENEOs) has taken shape. GENEOs give a topological model to encode equivariance and can be used to design layers of neural networks. After introducing GENEOs and some applications related to them, we will discuss a generalisation to partial equivariance. The main reason for this is that, due to noise and incompleteness of data, the symmetries of a data set cannot always be modelled as a group. Thus, we need to generalise GENEOs and allow for partial equivariance to be taken into account. The space of partially equivariant non-expansive operators (P-GENEOs) maintains most of the nice features of the space of GENEOs, such as compactness and convexity.
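As a toy illustration of the definition (my own example, not from the talk), convolution of a 1D signal with a kernel whose absolute values sum to at most 1 gives a GENEO for the translation group: it is translation-equivariant and non-expansive with respect to the sup norm:

```python
import numpy as np

rng = np.random.default_rng(0)
kernel = np.array([0.2, 0.5, 0.3])          # L1 norm <= 1, hence non-expansive

def geneo(signal):
    return np.convolve(signal, kernel, mode="same")

phi = rng.normal(size=50)
psi = rng.normal(size=50)

# Non-expansiveness: the operator does not increase sup-norm distances.
print(np.max(np.abs(geneo(phi) - geneo(psi))) <= np.max(np.abs(phi - psi)))

# Translation equivariance (checked away from the boundary): shifting the
# input shifts the output.
shift = 7
print(np.allclose(geneo(np.roll(phi, shift))[10:40],
                  np.roll(geneo(phi), shift)[10:40]))
```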
Conditional Gradient methods are an important class of methods to minimize (non-)smooth convex functions over (combinatorial) polytopes. Recently these methods received a lot of attention as they allow for structured optimization and hence learning, incorporating the underlying polyhedral structure into solutions. In this talk I will give a broad overview of these methods, their applications, as well as present some recent results both in traditional optimization and learning as well as in deep learning.
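For readers new to this class of methods, here is a textbook sketch of vanilla Frank-Wolfe over the probability simplex (the quadratic objective and step-size rule are standard illustrative choices, not specific to the talk):

```python
import numpy as np

def frank_wolfe(grad, x0, steps=200):
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        # Linear minimization oracle over the simplex: a vertex (basis vector).
        v = np.zeros_like(x)
        v[np.argmin(g)] = 1.0
        gamma = 2.0 / (t + 2.0)                  # standard step size
        x = (1 - gamma) * x + gamma * v          # iterate stays inside the polytope
    return x

# Minimize ||x - b||^2 over the simplex, for a target b that lies in the simplex.
b = np.array([0.1, 0.7, 0.2, 0.0])
x = frank_wolfe(lambda x: 2 * (x - b), np.ones(4) / 4)
print(np.round(x, 3))   # close to b
```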
We study the loss landscape of two-layer mildly overparameterized ReLU neural networks on a generic finite input dataset for the squared error loss. Our approach involves bounding the dimension of the sets of local and global minima using the rank of the Jacobian of the parametrization map. Using results on random binary matrices, we show most activation patterns correspond to parameter regions with no bad differentiable local minima. Furthermore, for one-dimensional input data, we show most activation regions realizable by the network contain a high dimensional set of global minima and no bad local minima. We experimentally confirm these results by finding a phase transition from most regions having full rank to many regions having deficient rank depending on the amount of overparameterization.
This is work with Kedar Karhadkar, Michael Murray, Hanna Tseran.
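A small numerical illustration of the rank computation underlying the abstract (a toy setup of mine, not the authors' experiments): within a fixed activation region, a two-layer ReLU network is linear in its parameters, so the Jacobian of the parametrization map can be written down and its rank checked directly:

```python
import numpy as np

# Two-layer ReLU network f_theta(x) = sum_j v_j * relu(w_j^T x); we compute
# the rank of the Jacobian of theta -> (f_theta(x_1), ..., f_theta(x_n))
# inside the activation region determined by the current weights.
rng = np.random.default_rng(0)
n, d, m = 12, 2, 8                      # data points, input dim, hidden units
X = rng.normal(size=(n, d))
W = rng.normal(size=(m, d))             # hidden weights
v = rng.normal(size=m)                  # output weights

A = (X @ W.T > 0).astype(float)         # fixed activation pattern on the data

# Within the region:
#   d f(x_i) / d v_j   = A_ij * (w_j^T x_i)
#   d f(x_i) / d w_jk  = A_ij * v_j * x_ik
J_v = A * (X @ W.T)                                            # (n, m)
J_W = (A[:, :, None] * v[None, :, None]) * X[:, None, :]       # (n, m, d)
J = np.concatenate([J_v, J_W.reshape(n, m * d)], axis=1)       # (n, m + m*d)

print(np.linalg.matrix_rank(J), "out of a maximum of", min(n, m + m * d))
```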
Bio: Guido Montúfar is an Associate Professor of Mathematics and Statistics & Data Science at UCLA, as well as the leader of the Math Machine Learning Group at the Max Planck Institute for Mathematics in the Sciences. His research focuses on deep learning theory and, more generally, mathematical aspects of machine learning. He studied mathematics and physics at TU Berlin, obtained the Dr. rer. nat. in 2012 as an IMPRS fellow in Leipzig, and held postdoc positions at Penn State and MPI MiS. Guido Montúfar is a 2022 Alfred P. Sloan Research Fellow.
https://www.math.ucla.edu/~montufar/
Random walks on discrete structures are among the most powerful mathematical tools for solving a wide range of theoretical and applied problems in discrete mathematics, probability, theoretical computer science, data science, and machine learning, ranging from Google's PageRank algorithm to clustering elements in complex networks and detecting topological communities in high-dimensional data. In this talk, I will show how random walks can serve as a bridge between quantitative features of data, which are determined by geometric tools such as curvature, and qualitative features that can be detected by topological methods. In traditional data analysis, graphs, as the simplest examples of discrete structures, are well studied, and we have a relatively broad understanding of their topological, geometric and stochastic properties. However, graphs cannot represent higher-order interactions among data points, and a main challenge on both the theoretical and applied sides is to extend results from graphs to more complicated structures suitable for modelling beyond binary interactions. After presenting some of the main ideas of geometric and topological data analysis via random walks on graphs, I will talk about the main challenges of extending some graph results to higher dimensions and how we can go further. I will mostly present the intuition behind the ideas and will not go into all the details; I therefore hope the content will be accessible to those without much background in geometric or topological methods. Familiarity with Markov chains and stochastic processes would be helpful.
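As a minimal reminder of the starting point of the talk, the following sketch (on a toy graph of my choosing) runs a simple random walk and recovers its stationary distribution, the mechanism behind PageRank-style ranking:

```python
import numpy as np

# Adjacency matrix of a small connected graph (a triangle with a short tail).
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

P = A / A.sum(axis=1, keepdims=True)    # transition matrix of the simple random walk

# Power iteration: the walker's distribution converges to the stationary
# distribution, which for this walk is proportional to the vertex degrees.
pi = np.ones(5) / 5
for _ in range(200):
    pi = pi @ P

print(np.round(pi, 3))
print(np.round(A.sum(axis=1) / A.sum(), 3))   # degree / (2 * #edges), for comparison
```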
Graph neural networks have two different data modes: the feature vectors and the graph structure. In this talk, we explore the interaction between hyperbolic geometry and these two data modes.
For the feature vectors, we explore the effect of geometry on the performance of three types of GNNs for node classification and link prediction. To do so, we propose a family of mixed-geometry GNNs with Hyperbolic and Euclidean components that can be trained jointly. We compare the performance of our mixed-geometry models against their Euclidean and Hyperbolic counterparts across various datasets. We see that the mixed-geometry models have the best performance for node classification, while the Hyperbolic models have the best performance for link prediction. Further, we see that for node classification, the choice of architecture has a more significant impact on performance than the choice of geometry, whereas for link prediction, the choice of geometry has a much more significant impact than the choice of architecture.
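For reference, the Hyperbolic components above rely on a hyperbolic distance; in the Poincaré ball model it takes the following closed form (a generic formula, not the authors' code):

```python
import numpy as np

# Hyperbolic distance between two points inside the unit (Poincaré) ball.
def poincare_distance(x, y):
    num = 2 * np.sum((x - y) ** 2)
    den = (1 - np.sum(x ** 2)) * (1 - np.sum(y ** 2))
    return np.arccosh(1 + num / den)

x = np.array([0.1, 0.2])
y = np.array([0.7, -0.5])
print(poincare_distance(x, y))            # hyperbolic distance
print(np.linalg.norm(x - y))              # Euclidean distance, for comparison
```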
For the graph structure, we study the problem of oversquashing. We use ideas from geometric group theory to present RelWire, a rewiring technique based on the geometry of the graph. We derive topological connections for RelWire. We then rewire different real-world molecule datasets and show that RelWire is Pareto optimal: it has the best balance between improvement in eigengap and commute times and minimizing changes in the topology of the underlying graph.
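The two quantities being balanced, eigengap and commute times, can be computed from the graph Laplacian; the following sketch (a toy graph of mine, not RelWire itself) shows both:

```python
import numpy as np

# Two triangles joined by a single edge: a small graph with a clear bottleneck.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)

deg = A.sum(axis=1)
L = np.diag(deg) - A
D_inv_sqrt = np.diag(1 / np.sqrt(deg))
L_norm = D_inv_sqrt @ L @ D_inv_sqrt

eigvals = np.sort(np.linalg.eigvalsh(L_norm))
print("eigengap (lambda_2):", round(eigvals[1], 4))

# Commute times from the pseudoinverse of the (combinatorial) Laplacian.
L_pinv = np.linalg.pinv(L)
vol = deg.sum()
def commute_time(i, j):
    return vol * (L_pinv[i, i] + L_pinv[j, j] - 2 * L_pinv[i, j])

print("commute time 0 <-> 5:", round(commute_time(0, 5), 3))
print("commute time 0 <-> 1:", round(commute_time(0, 1), 3))
```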
The set of functions parameterized by a linear fully-connected neural network is a determinantal variety. We investigate the subvariety of functions that are equivariant or invariant under the action of a permutation group. Examples of such group actions are translations or 90° rotations on images. For such equivariant or invariant subvarieties, we provide an explicit description of their dimension, their degree as well as their Euclidean distance degree, and their singularities. We fully characterize invariance for arbitrary permutation groups, and equivariance for cyclic groups. We draw conclusions for the parameterization and the design of equivariant and invariant linear networks, such as a weight sharing property, and we prove that all invariant linear functions can be learned by linear autoencoders.
This talk is based on joint work with Anna-Laura Sattelberger and Vahid Shahverdi.
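A quick numerical illustration of the weight-sharing conclusion (my own example, not the authors' code): averaging an arbitrary linear map over the cyclic group of shifts projects it onto the equivariant maps, which for this group are exactly the circulant matrices:

```python
import numpy as np

n = 5
rng = np.random.default_rng(0)
W = rng.normal(size=(n, n))

def shift_matrix(k, n):
    """Permutation matrix of the cyclic shift i -> i + k (mod n)."""
    P = np.zeros((n, n))
    for i in range(n):
        P[(i + k) % n, i] = 1.0
    return P

# Group averaging (Reynolds operator) over the cyclic group of shifts.
W_eq = sum(shift_matrix(k, n).T @ W @ shift_matrix(k, n) for k in range(n)) / n

# The result is circulant: each row is a cyclic shift of the first row
# (the weight-sharing pattern), and it commutes with the shift.
print(np.round(W_eq, 3))
print(np.allclose(W_eq @ shift_matrix(1, n), shift_matrix(1, n) @ W_eq))
```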
Spatial trajectories, often represented as a sequence of spatial positions, are a standard way to represent human mobility patterns. They are also used to represent motion patterns of, for instance, animals, drones, or last-mile rentals (e-scooters). However, these trajectories are notoriously difficult to work with, as they overlap and can stretch over long distances.
In this talk we discuss a new sketch (the minDist sketch) that makes just about any data-analysis task on trajectories simple and efficient. It first considers a spatial trajectory as an abstract shape, and then maps it to a high-dimensional Euclidean space as a vector. We can show recovery, pseudo-metric, and metric properties of this representation. Variants can include direction information, or traits like velocity and acceleration.
Moreover, once represented as this vector, the trajectory data is extremely easy to work with, allowing for out-of-the-box use of software for nearest-neighbor search, clustering, and classification. In particular, we conduct the first formal study of classifying spatial trajectories: given trajectories from two different distributions (e.g., generated by car or by bus) and a new unlabeled trajectory, how well can we predict which class it came from?
Across several data sets we have assembled for this task, we conduct a large study and show that the minDist sketch and its variants are consistently the easiest to use and most accurate methods (or at least among the best in each instance).
Joint work with Pingfan Tang and Hasan Pourmahmood
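One simple variant of the idea (my simplified reading of the abstract, not the authors' implementation) fixes a set of landmark points and records each landmark's distance to the polygonal trajectory, producing a fixed-length vector:

```python
import numpy as np

# Vectorize a polygonal trajectory by its minimum distance to each of a fixed
# set of landmark points; the result can be fed to off-the-shelf k-NN,
# clustering, or classification tools.
def point_segment_distance(p, a, b):
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def min_dist_vector(trajectory, landmarks):
    return np.array([
        min(point_segment_distance(q, trajectory[i], trajectory[i + 1])
            for i in range(len(trajectory) - 1))
        for q in landmarks
    ])

trajectory = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.0], [3.0, 1.0]])
landmarks = np.array([[0.5, 1.0], [1.5, -0.5], [2.5, 0.5], [3.0, 0.0]])

v = min_dist_vector(trajectory, landmarks)
print(np.round(v, 3))   # one coordinate per landmark
```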
Ideally, the distribution of the latent representations within any neural network should depend only on the task, the data, the loss, and other architecture-specific constraints. However, factors such as the random weights initialization, hyperparameters, or other sources of randomness may induce incoherent latent spaces that hinder any form of reuse. In this talk I'll report an important (and somewhat surprising) empirical observation: under the same data and modeling choices, distinct latent spaces typically differ by an unknown quasi-isometric transformation; that is, in each space, the distances between the encodings do not change. I'll then show how simply adopting pairwise similarities as an alternative data representation leads to guaranteed isometry invariance of the latent spaces, effectively enabling latent space communication: from zero-shot model stitching to latent space comparison between diverse settings. Several validation experiments will follow on different datasets, spanning various modalities (images, text, graphs), tasks (e.g., classification, reconstruction) and architectures (e.g., CNNs, GCNs, transformers).
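A minimal sketch of the alternative representation mentioned above (a toy example under my own assumptions): describing each sample by its cosine similarities to a fixed set of anchor samples makes the representation invariant to any rotation of the latent space:

```python
import numpy as np

rng = np.random.default_rng(0)

Z = rng.normal(size=(100, 16))          # latent encodings from "model A"
anchors_idx = np.arange(10)             # a fixed set of anchor samples

def relative_representation(Z, anchors_idx):
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    A = Zn[anchors_idx]
    return Zn @ A.T                     # cosine similarity to each anchor

R1 = relative_representation(Z, anchors_idx)

# "Model B": the same encodings up to an unknown rotation (an isometry).
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
R2 = relative_representation(Z @ Q, anchors_idx)

print(np.allclose(R1, R2))   # True: the relative representations coincide
```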
Motivated by the developing mathematics of deep learning, we build universal functions approximators of continuous maps between arbitrary Polish metric spaces X and Y using elementary functions between Euclidean spaces as building blocks. Earlier results assume that the output space Y is a topological vector space. We overcome this limitation by “randomization”: our approximators output discrete probability measures over Y. When X and Y are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for Hölder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of X and Y. For barycentric Y, including Banach spaces, R-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric spaces, our approximators reduce to Y-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint of geometric deep learning.
The Structure from Motion (SfM) problem aims to create a three-dimensional model of a scene from 2D images. To do so, point and line features are identified and matched across the images, creating correspondences that are triangulated and used to reconstruct the scene. In this talk we will focus on the triangulation of lines. We will define the line multiview varieties and present their main properties. We will then introduce the anchored multiview varieties, which serve as a starting point for the triangulation of the point-line incidences that appear frequently in line-segment triangulation.
This talk will explain that Convolutional Neural Networks without activation parametrize semialgebraic sets of real homogeneous polynomials that admit a certain sparse factorization. We will investigate how the geometry of these semialgebraic sets (e.g., their singularities and relative boundary) changes with the network architecture. Moreover, we will explore how these geometric properties affect the optimization of a loss function for given training data. This talk is based on joint work with Kathlén Kohn, Guido Montúfar, and Matthew Trager.
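An algebraic fact underlying this setting is that composing convolutional layers without activation corresponds to multiplying the polynomials whose coefficients are the filters, so the end-to-end filter factors sparsely; here is a short numerical check (illustrative only):

```python
import numpy as np

f = np.array([1.0, 2.0, -1.0])          # filter of layer 1  <->  1 + 2x - x^2
g = np.array([3.0, 0.0, 1.0])           # filter of layer 2  <->  3 + x^2

composed = np.convolve(f, g)            # end-to-end filter of the two layers
# Product of the two polynomials (np.polymul expects highest degree first).
product = np.polymul(f[::-1], g[::-1])[::-1]

print(composed)
print(np.allclose(composed, product))   # True: composing convolutions = multiplying polynomials
```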
This tutorial provides a practical demonstration of how numerical methods can enhance the comprehension of (algebraic) geometric problems and aid in proving statements. We begin with a brief overview of numerical homotopy continuation methods and the process of certifying solutions with interval methods.
Utilizing the software HomotopyContinuation.jl, we delve into practical applications, showcasing the techniques through a series of examples.
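For orientation, the core predictor-corrector idea can be sketched in a few lines (in Python rather than HomotopyContinuation.jl, heavily simplified, and with an illustrative univariate example of my own):

```python
import numpy as np

# Track the roots of a start system g (known roots) to the roots of a target
# system f along the homotopy H(x, t) = gamma*(1 - t)*g(x) + t*f(x), using an
# Euler predictor and a few Newton corrector steps. The "gamma trick" uses a
# generic complex constant to avoid singular paths.
f  = lambda x: x**3 - 2*x + 1          # target polynomial
df = lambda x: 3*x**2 - 2
g  = lambda x: x**3 - 1                # start system with known roots
dg = lambda x: 3*x**2

gamma = np.exp(1j * 0.7)

def H(x, t):   return gamma * (1 - t) * g(x) + t * f(x)
def Hx(x, t):  return gamma * (1 - t) * dg(x) + t * df(x)
def Ht(x, t):  return f(x) - gamma * g(x)

def track(x, steps=200):
    for k in range(steps):
        t, dt = k / steps, 1.0 / steps
        x = x - dt * Ht(x, t) / Hx(x, t)             # Euler predictor
        for _ in range(3):                           # Newton corrector at t + dt
            x = x - H(x, t + dt) / Hx(x, t + dt)
    return x

start_roots = [np.exp(2j * np.pi * k / 3) for k in range(3)]   # roots of x^3 - 1
solutions = [track(x0) for x0 in start_roots]
print(np.round(solutions, 6))
print([abs(f(s)) < 1e-8 for s in solutions])   # all residuals are tiny
```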
Participants
Asma Abid
Government College University
Sophie Achard
CNRS, Univ. Grenoble Alpes
Sharanya Achut
Georg-August-Universität Göttingen
Usman Ali
Bahauddin Zakariya University
Shivam Arora
Memorial University of Newfoundland
George Balla
TU Berlin
Jacob Bamberger
University of Oxford
Federico Barbero
University of Oxford
Abhishek Behera
MPI-CBG/ CSBD
Maxim Beketov
HSE University
Erik Bekkers
University of Amsterdam
Jonas Beyrer
University of Bonn
Alessio Borzì
MPI MIS Leipzig
Marie-Charlotte Brandenburg
MPI MiS Leipzig
Pierre Bréchet
MPI MiS
Paul Breiding
University of Osnabrueck
Sasha Brenner
MPI CBS
Sven Buchholz
TH Brandenburg
Evgeny Burnaev
Skoltech
Fernando Camacho Cadena
MPI MiS Leipzig
Jorge Eduardo Cardona Gaviria
Friedrich Schiller University Jena
Oscar Carlsson
Chalmers University of Technology
Jonas Cassel
Heidelberg University
Karel Devriendt
MPI MiS Leipzig
Valentina Disarlo
Heidelberg University
Nicholas Early
MPI MiS Leipzig
Marzieh Eidi
MPI MiS Leipzig / ScaDS.AI
Yara Elshiaty
Heidelberg University
Fatemeh Fahimi
MPI MiS Leipzig
Samantha Fairchild
MPI MiS Leipzig
Ines Garcia Redondo
Imperial College London
Marina Garrote-López
MPI MiS Leipzig
Andre Gomes
University of Campinas
Daniel Gonzalez
Heidelberg University
Pablo Gottheil
Leipzig University
Vincent Grande
RWTH Aachen University
Heather Harrington
MPI-CBG
Michael Hecht
CASUS/HZDR and University Wroclaw
Shah Hussain
Graz University of Technology
Mitul Islam
MPI MiS Leipzig
Shamoona Jabeen
University of Science & Technology Bannu
Saiyam Jain
KIT
Abel Jansma
MPI MiS Leipzig
Stefanie Jegelka
Massachusetts Institute of Technology
Parvaneh Joharinad
ScaDS.AI Dresden/ MPI MiS Leipzig
Jürgen Jost
MPI MIS Leipzig
Oleg Kachan
HSE University
Leonie Kayser
MPI MIS Leipzig
Janis Keck
MPI MIS Leipzig
Martin Keller-Ressel
TU Dresden
Yash Kesharwani
Savitribai Phule Pune University
Dimitra Kiakou
MPI CBS Leipzig
Sayako Kodera
Fraunhofer Institute for Nondestructive Testing IZFP
Joris Koefler
University of Warwick
Constantin Kogler
University of Oxford
Kathlén Kohn
KTH Stockholm
Fabius Krämer
Rheinische Friedrich-Wilhelms-Universität Bonn
Anastasis Kratsios
McMaster and Vector Institute
Oxana Krymina
VSK (Insurance company)
Deniz Kucukahmetler
Imperial College London
Olga Kuznetsova
Aalto University
Fabian Lander
University of Heidelberg
Gennady Laptev
Skoltech
Kolya Lettl
Leipzig University
Jiayi Li
University of California, Los Angeles
Ruowei Li
MPI MIS Leipzig
Felix Lotter
University of Bonn
Kelly Maggs
EPFL
Marta Magnani
University of Heidelberg
Filippo Mazzoli
MPI MIS Leipzig
Madhumita Mondal
The Institute of Mathematical Sciences
Guido Montúfar
MPI MIS Leipzig/ UCLA
Paula Mühlmeyer
Carl von Ossietzky Universität Oldenburg
Sayan Mukherjee
ScaDS.AI/ MPI MiS Leipzig
Johannes Müller
RWTH Aachen
Merik Niemeyer
MPI MiS Leipzig
Elias Nyholm
Chalmers University of Technology / University of Gothenburg
John Ochen
Makerere University
Osman Okutan
MPI MiS Leipzig
Eckehard Olbrich
MPI MiS Leipzig
Angélica Oliveira
UNICAMP
Arsenii Onuchin
Skoltech
Milka Anyango Otieno
Kenya Commercial Bank/Chuka University
Apondi Otieno
Technical University of Kenya
Maks Ovsjanikov
Ecole Polytechnique
Guendalina Palmirotta
University of Luxembourg
Niharika Paul
MPI MiS Leipzig
Eduardo Jose Perez Mejia
Fraunhofer-Institut für Zerstörungsfreie Prüfverfahren IZFP
Bei Wang Phillips
University of Utah
Jeff Phillips
University of Utah
Sebastian Pokutta
Zuse Institute Berlin
Renata Possobon
Unicamp/ MPI MiS Leipzig
Naeem Ahmad Pundeer
Jadavpur University, Kolkata, India
Kristian Ranestad
University of Oslo
Bernhard Reinke
MPI MiS Leipzig
Guillermo Restrepo
MPI MiS Leipzig
Emanuele Rodolà
Sapienza University of Rome
Kilian Roth
Leipzig University
Roua Rouatbi
TU Dresden
Leonel Rozo
Bosch Center for Artificial Intelligence
Elfat Sabitov
NRU HSE
Nico Scherf
MPI CBS Leipzig
Ilya Schurov
Radboud University
Vahid Shahverdi
KTH Royal Institute of Technology
Ehsan Shams
Alexandria University
Elima Shehu
Osnabrück University & MPI MiS
Jiajun Shi
MPI MiS Leipzig
Dmitrij Sitenko
University of Heidelberg
Mireille Soergel
MPI MiS Leipzig
Linus Sommer
Leipzig University
Rishi Sonthalia
UCLA
Daniel Speckhard
FHI-Berlin der MPG, HU-Berlin, MPI-FKF
Daniel Spitz
MPI MiS Leipzig
Simone Steinbrüchel
Leipzig University
Jan Stühmer
Heidelberg Institute for Theoretical Studies / Karlsruhe Institute of Technology
Dominik Sturm
TU Dresden
Bernd Sturmfels
MPI MiS Leipzig
Juan Esteban Suarez
TU Dresden
Diaaeldin Taha
MPI MiS Leipzig
Roan Talbut
Imperial College London
Sascha Timme
TU Berlin
Francesca Tombari
MPI MiS Leipzig
Nahid Torbati
MPI CBS Leipzig
Angelica Torres
MPI MiS Leipzig
Anton Ullrich
MPI MiS Leipzig
Vahideh Vahidifar
University of Sussex
Jörg Walter
Leipzig University
Qingsong Wang
University of Utah
Qiquan Wang
Imperial College London
Anna Wienhard
MPI MiS Leipzig
Arne Wolf
Imperial College London
Neza Zager Korenjak
MPI MiS Leipzig
Daniil Zaytsev
Higher School of Economics
Anum Zehra
The Women University Multan
Wei Zhao
Heidelberg Institute for Theoretical Studies
Giulio Zucal
MPI MiS Leipzig
Scientific Organizers
Samantha Fairchild
Max Planck Institute for Mathematics in the Sciences
Diaaeldin Taha
Max Planck Institute for Mathematics in the Sciences
Anna Wienhard
Max Planck Institute for Mathematics in the Sciences
Administrative Contact
Katharina Matschke
Max Planck Institute for Mathematics in the Sciences
Contact by email