The use of geometry to learn from data, and the learning of geometry from data.

  • Nicolas Garcia Trillos (Department of Statistics, University of Wisconsin-Madison, USA)
E1 05 (Leibniz-Saal)


In this talk I will explore the duality between geometry and data from different perspectives, focusing mostly on two of them. First, ideas from geometry and analysis have inspired a variety of popular procedures for unsupervised and semi-supervised learning tasks. One example is the so-called spectral clustering algorithm: given a data set $\mathit{X}=\{x_1, \dots, x_n\}$ with a weighted graph structure $\Gamma= (\mathit{X},W)$ on $\mathit{X}$, one uses the spectrum of an associated graph Laplacian to find a meaningful partition of the data set. Conversely, one can ask whether such ML algorithms are more concretely linked with the geometric ideas that inspired them: if the data set consists of samples from a distribution supported on a manifold (or at least approximately so), and the weights depend inversely on the distance between the points, do these procedures converge, as the number of samples goes to infinity, towards analogous procedures at the continuum level? I will explore this question mathematically using ideas from the calculus of variations, PDE theory, and optimal transport. Then, I will discuss a second perspective: the impact of the "choice of geometry" on learning and how, in the presence of labeled data, it relates to the inverse problem of learning geometry from data.
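
For readers unfamiliar with spectral clustering, the following is a minimal sketch of the idea mentioned in the abstract, not the speaker's own construction. It assumes Gaussian weights with a hypothetical bandwidth parameter `eps`, the unnormalized graph Laplacian $L = D - W$, and a two-way partition read off from the sign of the second eigenvector; the function name `spectral_bipartition` and the synthetic data are illustrative only.

```python
import numpy as np


def spectral_bipartition(X, eps):
    """Split the rows of X into two groups via the second Laplacian eigenvector.

    Assumptions (not from the abstract): Gaussian weights with bandwidth `eps`
    and the unnormalized graph Laplacian L = D - W.
    """
    # Pairwise squared Euclidean distances between the data points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Weights decay with distance, so nearby points are strongly connected.
    W = np.exp(-sq_dists / (2.0 * eps ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    # Unnormalized graph Laplacian: degree matrix minus weight matrix.
    L = np.diag(W.sum(axis=1)) - W
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)
    # The eigenvector of the second-smallest eigenvalue encodes the partition.
    second = eigvecs[:, 1]
    return (second > 0).astype(int), eigvals


# Synthetic example: two well-separated point clouds in the plane.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(3.0, 0.3, size=(20, 2)),
])
labels, eigvals = spectral_bipartition(X, eps=1.0)
```

On data like this, the first 20 points should receive one label and the last 20 the other; which group is labeled 0 or 1 depends on the arbitrary sign of the eigenvector. The consistency question raised in the abstract concerns what happens to such eigenvalues and eigenvectors as the number of samples grows.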

24.04.18 19.03.21

Mathematics of Data Seminar

MPI for Mathematics in the Sciences

Katharina Matschke

MPI for Mathematics in the Sciences