**Musterbildung, Energielandschaften und Skalierungsgesetze**

**Direktor: **

Felix Otto

**Kontakt: **Email

*Telefon:*

+49 (0) 341 - 9959 - 950

*Fax:*

+49 (0) 341 - 9959 - 658

**Anschrift:**

Inselstr. 22

04103 Leipzig

**Sekretariat:**

Katja Heid

Email, *Telefon:*

+49 (0) 341 - 9959 - 951

## Optimal transportation

Optimal transportation describes the problem of finding the most efficient way to move distribution of materials $\mu$ to the consumers $\lambda$, given that to move one unit of material from $x$ to $y$ costs $c(x-y)$. Since its first formulation by Monge in 1781, this problem has always been driven by applications. In fact, already its formulation was motivated by the problem of optimal movement of earth in road construction. Recently, the problem has generated additional interest due to applications in machine learning and AI. However, optimal transportation is also interesting from a mathematical perspective, taking ideas from PDE theory, probability and other fields. Questions of existence, regularity and convergence of approximations are of interest, although in our group we focus on quantitative approaches to the latter two areas.

## ... and optimal matching

A common problem in these applications is to have two sets of data points $\{x_i\}$ and $\{y_i\}$ and to seek a matching $T\colon\{x_i\}\to \{y_i\}$, that we think as a bijection between the two point clouds, that minimizes some total cost. The easiest instance of the matching problem is its Euclidean version. Given two point clouds $\{x_i\}_{i=1}^n, \{y_i\}_{i=1}^n \subset [0,1]^d$ of uniformly independently distributed points we are interested in the map $T$ that minimizes the total transportation cost, that is \begin{equation}\label{eq:matchingdef} \sum_{i=1}^n |x_i - T(x_i)|^2 := \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |x_i - y_{\sigma(i)}|^2. \end{equation} An illustration of such a matching, where both $\{x_i\}$ and $\{y_i\}$ are distributed uniformly in a square, is given in Figure 1. By a simple heuristic argument [21], we expect that, given a point $x_i$ from the first point cloud, we can find a point $y_j$ (for some $j$) within a volume of order $O(n^{-1})$. For this reason the typical inter-point distance is of order $O(n^{-\frac1d})$, which suggests the scaling of \eqref{eq:matchingdef} being of order $O(n^{-\frac2d})$. Unfortunately, this heuristic turns out to fail in low dimension. Indeed, the asymptotics of \eqref{eq:matchingdef} are given by \begin{equation}\label{eq:unifmatchingrates} \mathbb{E}[W_2^2 (\mu^n, \nu^n)]\sim n^{-\frac2d}\cdot\begin{cases} n & \text{if $d=1$,} \\ {\ln n} & \text{if $d=2$}, \\ 1 & \text{if $d\ge 3$.} \end{cases} \end{equation} As shown in \eqref{eq:unifmatchingrates} fluctuations are effective in low dimension, exhibiting a logarithmic correction when $d=2$. The critical dimension $2$ was first understood in the seminal paper [2].

Figure 1: An optimal matching between two point clouds

## Optimal matchings in 2d

Note that the optimal matching problem as described in \eqref{eq:matchingdef} can be seen as a special case of optimal transportation problems where the original and final distributions are point measures, that is they are discrete measures. Indeed, defining $\mu^n \colon=1/n\sum_{i=1}^n \delta_{X_i}, \nu^n \colon= 1/n\sum_{i=1}^n \delta_{Y_i}$, by the Birkhoff's von Neumann Theorem it holds \begin{equation} W_2^2(\mu^n, \nu^n) = \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |x_i - y_{\sigma(i)}|^2. \end{equation}

This allows to study \eqref{eq:matchingdef} from an analytical point of view exploiting the rich structure of optimal transport combining tools from convex analysis and partial differential equations. In [7] a very precise ansatz regarding convergence of the re-scaled cost in dimension $2$ was made. The main idea is an approximation of the optimal transport map by a first order linearization of the Monge–Ampère equation. This approach was formally justified in [3] at the level of the matching cost and later at the level of the transport map in [4]. This new analytical approach had a big impact on the study of the matching problems and stimulated high activity. For instance it has been shown that similar techniques can be applied to the study of more general densities [5], to the study of point clouds constituted by weakly dependent points [6] and by the eigenvalues of a random Hermitian matrix [18], as well as to the study of continuous matching, i.~e. when the empirical measure is substituted by the occupation measure of a stochastic process [20,26]. More interestingly, optimal transport techniques have been recently applied to a larger class of discrete optimization problems such as the traveling salesperson problem [17].

So far, we have focused on the matching problem on large scales. Thus we are interested in matching infinite point clouds in the whole space $\mathbb{R}^d$. The natural counterpart to uniformly independently distributed points is given by the Poisson point process. Identifying such a point process with the locally finite discrete measure $\mu \colon= \sum \delta_X$, given two Poisson point clouds $\{X\}, \{Y\}$ we are interested in their matching $T: \{X\} \to \{Y\}$. When considering the optimal matching problem with point clouds in $\mathbb{R}^d$ that are distributed according to the Poisson point process, it is possible to see that the expected cost is infinite. This is due to the fact that the expected distance of a point $x_i$ to a point in $\{y_i\}$ is positive. Thus, it is natural to look for solutions that are `locally' optimal, that is they are optimal when restricted to a finite subset of points. This property is known as cyclically monotonicity, since we can see it as saying that under a cyclical re-matching of a finite number of points, the cost increases. Moreover, the Poisson point process is invariant under translations. Hence, we expect the optimal matching to retain this invariance. This property is known as stationarity. However, for the quadratic cost $|x-y|^2$, cyclically monotone stationary matchings do not exist [19]. Moreover, it has been shown in [15] that the displacement is closed to a Gaussian free field at all mesoscopic scales justifying the ansatz of [7] on large scales.

## Harmonic approximation

The proof of the non-existence of stationary, locally optimal matchings in 2d is based on a tool, which is called harmonic approximation. It says that optimal transportation maps are well-approximated by harmonic maps.

The harmonic approximation result was originally developed in [13], in order to study the regularity properties of solutions to the quadratic optimal transportation problem. These problems had been studied before mainly using techniques from nonlinear PDE theory developed by Caffarelli. The outcome of these studies was a partial $C^{1,\alpha}$-regularity theory for the transportation map, under the assumption that $\mu$ and $\lambda$ are compactly supported measures with $C^{1,\alpha}$-densities [10]. In [13] this result is re-proven using a variational approach.

This approach relies primarily on the connection of the optimal transportation problem with a lineari\ed PDE problem. Note that we can view an optimal transportation map $T$ as a change of variables. Thus, at least formally we should expect the change of variables formula to hold, that is $$ \lambda(T(x))\textup{Det}\,(\nabla T(x)) = \mu(x). $$ It turns out that not only does the change of variables formula hold, but $T$ satisfies a further structural restriction- $T(x)-x$ must be the gradient of a convex map! Re-writing the change of variables formula in terms of this convex function $\phi$, we obtain $$ \lambda(\nabla\phi(x)) = \textup{Det}\,(\textup{Id}+\nabla^2\phi(x)) = \mu(x). $$ The idea of the harmonic approximation is to perform a Taylor-expansion for $\textup{Det}$ around the identity. Since $\textup{Det}(\textup{Id}+A)\sim 1+ \textup{tr}\, A$, if we assume in addition that $\mu\sim\lambda\sim 1$, this gives at first order the equation $$ -\Delta\tilde \phi = \mu-\lambda. $$ If this argument is not just formal, but can be made rigorous, we should expect that $\tilde \phi\approx \phi$. This is the outcome of the harmonic approximation result in [13]. A quantified version of this outcome can be found in [14]. To illustrate this connection we compare the averaged dislocation vector of an optimal matching (red) to the averaged gradient of a harmonic map, which is constructed according to the harmonic approximation theory, in Figure 2. This linearization approach is also related to Moser transport, c.f. [25, Chapter 1], as well as the linearization of Wasserstein distance to $\dot{H}^{-1}$ [24, Chapter 5.2.2].

Figure 2: Comparing optimal matching and harmonic gradient

Regularity results for continuous densities were obtained in [12], while the boundary case was treated in [22]

## More general cost funtions

The aforementioned regularity results can be extended to cost functions that locally look like the Euclidean cost function $|y-x|^2$, e.g. the geodesic distance squared on Riemannian manifolds, by appealing to the concept of almost-minimizers of the optimal transport problem [23]. This provides a variational counterpart to the partial regularity result proved in [9], and shows the flexibility of the approach based on harmonic approximation. In fact, this approach is purely variational in character (analogous to De Giorgi's variational regularity theory for minimal surfaces), completely circumventing the use of the Monge–Ampère equation. For even more general cost functions, such as $p$-cost $|x-y|^p$, many questions remain open.

## Multi-marginal

A natural generalization of the optimal transport problem is the question of optimally coupling more than two measures, leading to the so-called multi-marginal optimal transport problem. Multi-marginal optimal transport, originally considered by Gangbo and Święch [11], has recently received attention due to its connection to barycenters of measures with respect to the Wasserstein geometry [1], which provides a natural way of interpolating many probability measures. However, the regularity properties of this interpolating density is far from being well-understood. If the cost function is given by pairwise Coulomb interaction, the multi-marginal optimal transport problem also arises as a semi-classical limit (in the strong coupling regime) of electronic density functional theory [8]. Due to the repulsive nature of the cost, the behavior of optimal transport in this case is rather different from the usually considered cost functions and many questions remain open.