The TAGS workshop will bring together researchers in topological data analysis, applied algebraic geometry, statistics, and related fields. There will be a number of introductory talks on each of these topics, with the aim of attracting participants who are not necessarily experts in these fields. We will provide a platform for discussing areas of common interest, methodologies extending between fields, and currently open problems.
Travel funding and accommodation can be provided for early-career participants such as postdoctoral researchers and PhD students. Applicants are expected to write a short letter of motivation and to kindly agree to present their work in the form of a poster. PhD students will further be asked to provide a letter of recommendation from their supervisors. Further instructions on the funding application follow in the confirmation of registration. The deadline for applications is December 1, 2017.
Persistent homology, the homology of a filtration of topological spaces, is described up to isomorphism by the persistence diagram (barcode), which encodes the structure of its indecomposable summands. The stability theorem, a fundamental theorem in the theory of persistence, states that small changes in the input data lead only to small changes in the persistence barcode. We will explore a constructive algebraic approach to this result that admits a high level of generality.
Stability is certainly one most important aspect of persistence and there are numerous results. Most of the results are in terms of bottleneck distance between persistence diagrams. While applicable in general settings, it is a sup-norm, and using it to prove convergence can be difficult. In this talk, I will discuss some new results concentrating on a bound on the p-Wasserstein distance between persistence diagrams. The main result states that p-norm between two functions on a simplicial complex is an upper bound on the p-Wasserstein distance between the corresponding persistence diagrams. I will discuss some other related results as well as applications.
It is widely known that multidimensional persistent homology is "hard". But what does this really mean? In the first part of the talk I will discuss this from the point of view of representation theory of quivers. Next I will show how multidimensional clustering gives rise to multidimensional persistence modules of a particularly appealing form. Given their simplicity, it is natural to wonder if we are able to carry out persistence computations like in 1-D in the setting of clustering. I will discuss recent work with Ulrich Bauer, Steffen Oppermann and Johan Steen, which shows that this restriction alleviates the aforementioned "hardness" in only very special cases. The last part of the talk will concern how much of this complexity we see in persistence modules coming from data.
Given a locally finite set X in R^d and a positive radius, the k-fold cover of X and r consists of all points that have k or more points of X within distance r. The order-k Voronoi diagram decomposes the k-fold cover into convex regions, and we use the nerve of this decomposition to compute homology and persistence. In particular the persistence in depth is interesting from a geometric as well as algorithmic viewpoint. The main tool in understanding its structure is a rhomboid tiling in R^{d+1} that combines the nerves for all values of k into one. We mention a straightforward consequence, namely that the cells in the nerve are generically not simplicial, unless k=1 or d=1,2.
(Joint work with Georg Osang.)
We will extend persistent (co-)homology by considering diagrams of incidence algebras of finite posets. By using results of Gerstenhaber et al. we will show what is the link between diagram cohomology, Hochschild cohomology, and multidimensional persistence connecting seemingly uncorrelated research areas as topological data analysis, noncommutative geometry and topological field theory (via Cattaneo - Mnev - Reshetikhin work on cellular field theory).
This talk will address various aspects of TDA from a more statistical perspective. It will include a brief survey of relevant approaches for statistical inference, and a critical discussion of some challenges involved in (i) the development of approaches for statistical inference; (ii) the derivation of statistically meaningful theory; and (iii) more practical challenges arising in scientific applications of TDA methods.
In this talk I will present some thoughts on challenges of inferring topological features from noisy data in the perhaps most simple setting: estimating the number of modes (i.e., local maxima) of a 1D signal. Already this simple setup illustrates some of the statistical obstacles when trying to infer topological information. A particular focus will be on inference without presmoothing the data. In particular, I will discuss some aspects of persistent topology in the 1D setting and present a somewhat alternative approach based on the Kolmogorov instead of the sup norm. I will highlight some of the advantages and disadvantages of these approaches from a statistical perspective.
A wide range of statistical applications rely on the central limit theorem (CLT) which assures for random vectors, upon existence of second moments, asymptotic square-root-n-normality of the mean. For deviates taking values in a Riemannian manifold, the celebrated CLT of Bhattacharya and Patrangenaru (2005) extends this result to Fréchet means of the intrinsic distance, under a collection of additional assumptions.
In the sequel the Battacharya-Patrangenaru CLT has been extended to Fréchet means of more general distances, e.g. Procrustes distances on shape spaces to "distances" between data and a data-descriptor, a geodesic say, and even to nested sequences of random descriptors, such as principal nested spheres.
In particular, the latter is a generalization of the classical asymptotics of principal components by Anderson (1963). Still, for all of these general scenarios, analogs of the above mentioned additional assumptions are required to hold true. Meanwhile, the geometric meaning of these assumptions has been exemplary studied: the distribution near the cut locus of the mean may be responsible for rates slower than square-root-n with non-normal limiting distributions, and we call this phenomenon smeariness. Asymptotic nonnormality may also be observed on noncomplete manifolds, even without cut loci. For nonmanifold spaces allowing only for a Riemannian stratification, for example BHV phylogentic tree spaces, "infinitely fast" rates have been observed, making straightforward statistical testing impossible, this phenomenon is called stickiness.
Information Geometry is the differential geometric study of the manifold of probability models, and promises to be a unifying geometric framework for investigating statistical inference, information theory, machine learning, etc. Instead of using metric for measuring distances on such manifolds, these applications often use “divergence functions” for measuring proximity of two points (that do not impose symmetry and triangular inequality), for instance Kullback-Leibler divergence, Bregman divergence, f-divergence, etc. Divergence functions are tied to generalized entropy (for instance, Tsallis entropy, Renyi entropy, phi-entropy) and cross-entropy functions widely used in machine learning and information sciences. It turns out that divergence functions enjoy pleasant geometric properties – they induce what is called “statistical structure” on a manifold M: a Riemannian metric g together with a pair of torsion-free affine connections D, D*, such that D and D* are both Codazzi coupled to g while being conjugate to each other. Divergence functions also induce a natural symplectic structure on the product manifold MxM for which M with statistical structure is a Lagrange submanifold. We recently characterize holomorphicity of D, D* in the (para-)Hermitian setting, and show that statistical structures (with torsion-free D, D*) can be enhanced to Kahler or para-Kahler manifolds. The surprisingly rich geometric structures and properties of a statistical manifold open up the intriguing possibility of geometrizing statistical inference, information, and machine learning in string-theoretic languages.
A simple permutoassociahedronDjordje BaralicMathematical Institute SANUPlease see the abstract as PDF file.Stable Resolutions of Multi-Parameter persistence modulesNicolas BerkoukINRIA SaclayMulti-graded betti numbers, one of the only invariant we have to study multi-parameter persistence modules, are highly not stable with respect to the interleaving distance. However, they arise from the existence of free minimal resolutions and we propose to show how we can equip the derived category of persistent modules (the one in which resolutions « naturally » live) with a derived distance to make those resolutions stable. As a byproduct, we get the non-existence of a right (or left) exact functor from the category of modules to the one that decomposes in thin summands.Topological data analysis to identify blood cellsNello BlaserUniversity of BergenMass cytometry can be used to measure up to forty different parameters for millions of cells. We used topological methods to analyse and visualize these large high-dimensional point clouds and compare the results of traditional methods used in cytometry to the results of topological methods. We found that topological methods could be used to quickly identify large cell populations and that these cell populations agree with the ones found by traditional methods.Sparse CoversMorten BrunUniversity of BergenWe construct sparse approximations of filtered Cech, Rips and Voronoi complexes. The sparse Cech and Rips complexes appeared first in papers of Sheehy and Cavanna-Jahnseir-Sheehy. The sparse Voronoi complex is closely related to witness complexes. These sparse complexes are constructed from an arbitrary finite metric space, and they have a version for every choice of interleaving constant greater than one. The sparse Cech, Rips and Voronoi complexes are subcomplexes of nerves of sparse covers of the underlying finite metric space, and most of out reasoning takes place on the level of such covers.Sparse Dowker NervesMorten BrunUniversity of BergenTopological Change Points for Time-Varying Brain Activity NetworksBen CassidyColumbia UniversityA common problem in the statistical analysis of multivariate time series is to identify and characterise change points where the probability distribution underlying the system of interest shifts from one state to another throughout time. But most existing change point statistics are either univariate measures or a simple function of the empirical covariance matrix, and are not suitable for detecting changes in time-varying networks.
We present new approaches for identifying and characterising change points in time-varying brain activity networks constructed from functional MRI data. The methods are based on persistent homology and build on our recent work for non-stationary network system identification. We demonstrate examples on real and simulated data.Functional Data Analysis using a Topological Summary Statistic: the Smooth Euler Characteristic TransformAndrew ChenColumbia UniversityWe introduce a novel statistic, the smooth Euler characteristic transform (SECT), which is designed to integrate shape information into regression models by representing shapes and surfaces as a collection of curves with little to no loss in information. As a result, for the first time detailed geometric and topological information can be utilized as informative predictor variables in an a large catalogue of functional data analytic methodologies. We illustrate the utility of the SECT in a radiogenomics context by showing that topological summaries of tumors, assayed by magnetic resonance imaging (MRI), are better predictors of clinical outcomes in patients with glioblastoma multiforme (GBM). We show that SECT features outperform previous topological summary statistics, and alone explain more of the variance in patient survival than gene expression, volumetric features, and morphometric features.Multidimensional persistence modulesJérémy CochoyINRIA / Paris-SudPersistence is a tool which allow to recover topological property of a sampled surface from a point cloud. In the construction, an algebraic object appear : the persistence module. The generalisation of this object with index set of higher dimension still present a challenge for their understanding. This poster will present persistence in the multidimensional setting and some of it's properties.Topology-based metric learningOliver GäfvertKTH Royal Institute of TechnologyMetric learning (or similarity learning) is the task of learning a metric that, in some sense, maximally separates two sets of objects. Using a parametrized class of pseudometrics on persistence modules, derived from what we call persistence contours. We define a scheme for learning a pseudometric on persistence modules that maximally (in some sense) separates between two given sets of persistence modules. In the one-dimensional case this means that we're optimizing the parameters of a pseudometric on the space of barcodes in such a way that it ''maximally'' separates two given sets of barcodes. The resulting metric can then be used for classifying unseen barcodes into the two sets.Persistent chain complexesBarbara GiuntiUniversità di PaviaPersistent homology of the Vietoris-Rips complex has proven to be an effective tool to extract geometric information from data sets. Homology, however, is a drastic simplification and in certain situations might remove too much information. This project aims to investigate invariants, extracted not from the homology but rather from the persistent chain complex. For this approach to be useful, a decomposition structure theorem for persistence chain complexes is necessary. That is the crucial fundament theorem of this project. This presentation aims to describe the indecomposable and present an algorithm of how to enumerate indecomposable summands for a persistent chain complex.Geometric comparison of phylogenetic trees on varying taxa setsGillian GrindstaffThe University of Texas at AustinAs discovered by Billera, Holmes, and Vogtmann in 2001, the set of phylogenetic trees on n taxa form a CAT(0) contractible geodesic metric space, referred to as BHV space, which allows for the adaptation of statistical techniques to sets of phylogenies. However, the problem of comparing or combining trees on different sets of taxa has not been possible using the same methods. Davidson et al. recently created a combinatorial algorithm to extend tree topologies to higher dimensional sectors of BHV space. We adapt their algorithm for metric trees to compute the subspace of extensions of a tree on fewer than n taxa to n-dimensional BHV space. We also define several metrics which can be used to measure the compatibility of multiple metric tree fragments, and the degree to which a collection of fragments can be realized as subtrees of the same tree structure.Holonomic extended least angle regressionMarc HarkonenGeorgia Institute of TechnologyModel selection algorithms are used to automatically choose the statistical model or family of models that best fit our data. For example in generalized linear models each parameter corresponds to one covariate, and then one might ask which of these are actually useful, or the most “impactful” in our fitting procedure, and how do we actually decide which covariates to include in the model. There have been numerous attempts at automatizing this process. Most notably, the Least Angle Regression algorithm, or LARS, by Efron et al. is a computationally efficient algorithm that ranks the covariates of a linear model without being too greedy. The LARS algorithm was extended in 2010 by Hirose and Komaki for a class of distributions in the generalized linear model by using properties of the manifold of exponential families as a dually flat manifold. However this extension assumes that the normalizing constant of the joint distribution of observations is “easy” to compute. This is often not the case, for example the normalizing constant may contain an integral that has no analytic solution.
However, if the normalizing constant satisfies a holonomic system, we can circumvent this issue by using a method inspired by the Holonomic Gradient Method from Nakayama et al. In this poster we present a modification of the holonomic gradient method and add it to the extended LARS algorithm. We call this the holonomic extended least angle regression algorithm, or HELARS. The algorithm was implemented using the statistical software R, and was tested with real and simulated datasets.Matroid Structure in Computational TopologyGregory Henselman-PetrusekPrinceton UniversityThe proposed poster would highlight recent developments in matroid theory, homological algebra, and computation, with a view to statistical topology. Several of the premier applications in this field rest on the decomposition of homological persistence modules into 1-dimensional factors that realize, collectively, the structure of a finitary matroid. The combinatorial properties of this object have profound consequences in statistical topology and homological algebra, with numerous connections to Lie, Hodge, and Gromov-Witten theory. As an elementary application, we present the Eirene library for homological persistence, an open-source platform for persistence module decomposition that outperforms, in several cases of interest, the current state-of-the-art by orders of magnitude in both time and memory.Model Selection by Algebraic CriteriaShaoxiong HuQueen Mary University of LondonPlease see the abstract as PDF file.Statistical tools for persistence diagrams via smoothed Optimal TransportThéo LacombeInria SaclayStandard metrics on persistence diagrams (Bottleneck or Wasserstein) provide a powerful theoretical framework to compare topological descriptors. However, their combinatorial nature makes them hard to handle from a statistical perspective. By leveraging recent progress in optimal transport theory, we provide an approximation of these metrics which presents interesting properties for statistical and numerical purposes. We show applications to the so-called Wasserstein barycenter problem for persistence diagrams and other numerical problems.Activity-based Graphical Models for Time SeriesZhangsheng LaiSingapore University of Design and TechnologyGiven a directed graph, possibly with cycles and self-loops, we define a continuous-time Markov chain where the nodes are binary random variables and where the off-diagonal entries of the transition rate matrix are defined by monomials in the model parameters. We call this Markov chain a McCulloch-Pitts process, and its defining monomials are known as activities. Such processes play an important role in biologically plausible deep learning, because they explain an experimentally-observed phenomenon in mammalian brains known as Spike Timing Dependent Plasticity. The learning rules for the model are simple, local and distributed, so we may finally be able to train massive neural networks. The model is a toric variety in the space of stochastic rate matrices, and learning corresponds to movement along this toric variety to a point that is closest to the empirical distribution by means of multiplicative weight updates. This is joint work with Chris Hillar and Shaowei Lin.Sheaves on configuration spacesJānis LazovskisUniversity of Illinois at ChicagoCombining all the configuration spaces of a manifold gives the Ran space, which has a natural topology on it. There exists a constructible sheaf on this space, which can be described explicitly or by using a duality between such sheaves and functors of exit paths. Further, taking the product of the Ran space with non-zero real numbers gives a sheaf that classifies the Cech (or Vietoris-Rips) complexes that come from point samples of the base manifold. In other words, this sheaf contains the same information as the persistent homology of all point clouds on the manifold.Simplicial networks and effective resistanceKang-Ju LeeSeoul National UniversityWe introduce the notion of effective resistance for a \emph{simplicial network} $(X,R)$ where $X$ is a simplicial complex and $R$ is a set of resistances for the top simplices, and prove two formulas generalizing previous results concerning effective resistance for resistor networks. Our approach, based on combinatorial Hodge theory, is to assign a unique harmonic class to a \emph{current generator} $\sigma$, an extra top-dimensional simplex to be attached to $X$. We will show that the harmonic class gives rise to the \emph{current} $I_{\sigma}$ and the \emph{voltage} $V_{\sigma}$ for $X\cup\sigma$, satisfying Thompson's energy-minimizing principle and Ohm's law for simplicial networks. The effective resistance $R_{\sigma}$ of a current generator $\sigma$ shall be defined as a ratio of the $\sigma$-components of $V_{\sigma}$ and $I_{\sigma}$. By introducing \emph{potential} for voltage vectors, we present a formula for $R_\sigma$ via the inverse of the weighted combinatorial Laplacian of $X$ in codimension one. We also derive a formula for $R_{\sigma}$ via weighted high-dimensional tree-numbers for $X$, providing a combinatorial interpretation for $R_{\sigma}$. As an application, we generalize Foster's Theorem, and discuss various high-dimensional examples. This is a joint work with Woong Kook.Analysis of contagion maps on a class of networks that are spatially embedded in a torusBarbara MahlerUniversity of OxfordThe spreading of real-world contagions is often guided by the geometry of the underlying domain. One example are contagious diseases that historically spread gradually along part of the earth's surface.
In this case contagions propagate as a wave front, passing between geometrically close entities. However, in today's world with modern transport and communication technology there are many scenarios where, even in the presence of a well-defined underlying geometry, such as the spherical surface of the earth, a contagion spreads via connections that are not intrinsically geometric.
Examples of such scenarios include the spreading of an infectious disease via passengers on a long-distance flight, or the dissemination of information via internet communication. Here the contagion jumps across space to distant locations rather that strictly following the geometry of the underlying domain.
Following the methodology introduced in [1], we study such phenomena by considering a threshold contagion model on a family of networks that can be construed as being embedded in a torus and have both `geometric edges' that respect the geometry of the underlying torus, and `non-geometric' edges that are not constrained by the underlying geometry and can connect nodes across a long range. We investigate what propagation pattern a contagion follows and how much this pattern is influenced by the structure of the underlying space.
[1] D. Taylor et al, “Topological data analysis of contagion maps for examining spreading processes on networks,” Nature communications, vol. 6, 2015.Simplifying a simplicial tower for efficient computation of Persistent HomologySiddharth PritamInria, Sophia Antipolis FranceWe present a novel approach to reduce the size of a simplicial tower ( simplicial complexes connected through general simplicial maps) whille preserving their persistent homology. This technique is easily extendable in the case of Zigzag persistence as well.Optimal Bandwidth for Nonparametric Estimation of Density Level SetsWanli QiaoGeorge Mason UniversityBandwidth selection is crucial in the kernel estimation of density level sets. Risk based on the symmetric difference between the estimated and true level sets is usually used to measure their proximity. In this paper we provide an asymptotic Lp approximation to this risk, where p is characterized by the weight function in the risk. In particular the excess risk corresponds to an L2 type of risk, and is adopted in an optimal bandwidth selection rule for nonparametric level set estimation of d-dimensional density functions (d >= 1).Persistence contours for quantifying topological difference in 1D persistent homologyHenri RiihimäkiTampere University of Technology + KTH StockholmDifferent studies applying persistent homology in data analysis are giving evidence that topological features in different scales are important for the analysis task at hand. This is in contrast to the traditional view in persistence putting focus on features with high persistence. We develop tools called persistence contours to quantify topology at different scales with different emphasis. This leads to a simple functional invariant for persistence modules. Persistence contours also give general metrics between persistence objects.Barcode Embeddings, Persistence Distortion, and Inverse Problems for Metric GraphsYitzchak SolomonBrown UniversityStable topological invariants are a cornerstone of persistence theory and applied topology, but their discriminative properties are often poorly-understood. In this paper we demonstrate the generic injectivity of a rich homology-based invariant first defined by Dey, Wang, and Shi, which we think of as embedding a metric graph in the barcode space.A notion of homological clusteringGard SpreemannEPFLSpectral clustering of graphs has long been a standard tool. We propose a similar method of clustering simplicial complexes based on the simplicial Laplacian.Asymptotic Equivalence between nonparametric density estimation and Gaussian white noiseStefan SteinUniversity of WarwickFollowing a paper of Nussbaum (1996) I present the key steps in proving asymptotic equivalence in the Le Cam sense between nonparametric density estimation and Gaussian white noise. The poster will also include a short introduction to Le Cam theory from its classical perspective in terms of risk functions as well as from a category theoretical point of view.Cohomology of real moment-angle complexesElizabeth VidaurreUniversity of RochesterCertain subspaces of a product of spaces whose factors are labeled by the vertices of a simplicial complex are referred to as "polyhedral product spaces". Polyhedral products are given by taking the union of subproducts depending on the face category of a fixed simplicial complex on $m$ vertices and a labelled family of $m$ topological pairs. Such polyhedral products are realized by objects studied in combinatorics, commutative algebra and algebraic geometry. There are applications of polyhedral products to topological robotics.Offset hypersurfaces and persistent homology of algebraic varietiesMadeleine WeinsteinUC BerkeleyWe present connections between the true persistent homology of algebraic varieties and their corresponding offset hypersurfaces. By Hardt's theorem, true persistent homology events occur at algebraic numbers. By studying the offset hypersurface, we characterize when true persistent homology events occur. The degree of the offset hypersurface is bounded by the Euclidean Distance Degree of the starting variety, so this way we show a bound on the degree of the algebraic numbers at which the homological events occur.
Data sets often have an inherent geometric structure, revealed by approximate algebraic relations of low degree. We discuss a range of methods for studying such samples. One aim is to enhance the toolkit of manifold learning and applied topology with the power of algebraic geometry. This is an ongoing joint project with Paul Breiding, Sara Kalisnik and Madeleine Weinstein.
In network science, concepts and quantities are often introduced in a rather ad hoc manner. I shall use analogies with another geometric discipline, Riemannian geometry, to develop network analysis tools in a systematic manner.
At the heart of Riemannian geometry are notions of curvature. In particular, in recent decades,fundamental aspects of Ricci curvature have been discovered and explored. In fact, there are two aspects of Ricci curvature that turn out to be useful in metric geometry. One is about the average dispersion of geodesics. The other is about comparing the distance between two points with the transportation cost between their local neighborhoods.
As I shall explain, both are useful for the theoretical investigation of networks and other complexes as well as for the analysis of empirical network data.
I will present how computational algebra can be used to study problems arising in topological data analysis. I will motivate each of these problems using applications. First I will present how we use numerical algebraic geometry to sample points on real algebraic varieties. Time permitting, I will propose interpretable invariants for multi-parameter persistent homology using computational commutative algebra.
Semialgebraic sets (that is subsets of Euclidean spaces defined by polynomial equations and inequalities) come in a wide variety of shapes. This raises the problem of describing a given specimen, from basic features like the number of connected components, to finer ones, like homology.
Following a well-known approach, we compute the homology using approximations by point cloud. We show how to ensure correctness and give complexity bounds in terms of the conditioning of the input equations. A probabilistic analysis shows that outside of a vanishingly small subset of the input space, the complexity is single exponential with respect to the number of variables. This improves upon the previously-known double exponential complexity bounds.
Topological data analysis has been shown to be an innovation in studying genetic reassortment events in RNA viruses --- a key factor in studying viral mutations, such as those in HIV. In this talk, I will show how a sufficient statistic for an important invariant in topological data analysis, persistence barcodes, constructed using tropical geometry allows for classical probability distributions to be assumed on the tropicalized representation of barcodes. This makes a variety of parametric statistical inference methods amenable to barcodes, all while maintaining their initial interpretations. More specificially, exponential family distributions may be assumed and likelihood functions for persistent homology may be constructed. We conceptually demonstrate sufficiency and illustrate its utility in dimensions 0 and 1, with a rigorous parametric statistical analysis of inter- and intra-subtype reassortment events in both HIV and avian influenza. This is joint work with Sara Kalisnik Verovsek, Juan Angel Patino-Galindo and Lorin Crawford, and supported by the US NIH under project number R01GM117591.
The idea of selecting a model via penalizing a log-likelihood type criterion is an old idea in statistics. It goes back to the early seventies with the pioneering works of Mallows and Akaike.
One can find many consistency results in the literature for such criteria. These results rely on limit theorems from probability theory and are asymptotic in the sense that one deals with a given number of models and the number of observations tends to infinity. This setting turns out to be irrelevant for many applications (especially for high-dimensional data). Taking as a central illustrative example the Gaussian white noise framework, we shall show that it is possible to develop a non asymptotic theory for model selection. This theory for model selection deeply relies on the concentration of mesure phenomenon which is known to be of a geometric nature. Conversely some important issues of geometrical inference can be solved by using the model selection approach. We shall illustrate this point by focusing on a specific example: the use of model section for simplicial approximation.
A simple permutoassociahedronDjordje BaralicMathematical Institute SANUPlease see the abstract as PDF file.Stable Resolutions of Multi-Parameter persistence modulesNicolas BerkoukINRIA SaclayMulti-graded betti numbers, one of the only invariant we have to study multi-parameter persistence modules, are highly not stable with respect to the interleaving distance. However, they arise from the existence of free minimal resolutions and we propose to show how we can equip the derived category of persistent modules (the one in which resolutions « naturally » live) with a derived distance to make those resolutions stable. As a byproduct, we get the non-existence of a right (or left) exact functor from the category of modules to the one that decomposes in thin summands.Topological data analysis to identify blood cellsNello BlaserUniversity of BergenMass cytometry can be used to measure up to forty different parameters for millions of cells. We used topological methods to analyse and visualize these large high-dimensional point clouds and compare the results of traditional methods used in cytometry to the results of topological methods. We found that topological methods could be used to quickly identify large cell populations and that these cell populations agree with the ones found by traditional methods.Sparse CoversMorten BrunUniversity of BergenWe construct sparse approximations of filtered Cech, Rips and Voronoi complexes. The sparse Cech and Rips complexes appeared first in papers of Sheehy and Cavanna-Jahnseir-Sheehy. The sparse Voronoi complex is closely related to witness complexes. These sparse complexes are constructed from an arbitrary finite metric space, and they have a version for every choice of interleaving constant greater than one. The sparse Cech, Rips and Voronoi complexes are subcomplexes of nerves of sparse covers of the underlying finite metric space, and most of out reasoning takes place on the level of such covers.Sparse Dowker NervesMorten BrunUniversity of BergenTopological Change Points for Time-Varying Brain Activity NetworksBen CassidyColumbia UniversityA common problem in the statistical analysis of multivariate time series is to identify and characterise change points where the probability distribution underlying the system of interest shifts from one state to another throughout time. But most existing change point statistics are either univariate measures or a simple function of the empirical covariance matrix, and are not suitable for detecting changes in time-varying networks.
We present new approaches for identifying and characterising change points in time-varying brain activity networks constructed from functional MRI data. The methods are based on persistent homology and build on our recent work for non-stationary network system identification. We demonstrate examples on real and simulated data.Functional Data Analysis using a Topological Summary Statistic: the Smooth Euler Characteristic TransformAndrew ChenColumbia UniversityWe introduce a novel statistic, the smooth Euler characteristic transform (SECT), which is designed to integrate shape information into regression models by representing shapes and surfaces as a collection of curves with little to no loss in information. As a result, for the first time detailed geometric and topological information can be utilized as informative predictor variables in an a large catalogue of functional data analytic methodologies. We illustrate the utility of the SECT in a radiogenomics context by showing that topological summaries of tumors, assayed by magnetic resonance imaging (MRI), are better predictors of clinical outcomes in patients with glioblastoma multiforme (GBM). We show that SECT features outperform previous topological summary statistics, and alone explain more of the variance in patient survival than gene expression, volumetric features, and morphometric features.Multidimensional persistence modulesJérémy CochoyINRIA / Paris-SudPersistence is a tool which allow to recover topological property of a sampled surface from a point cloud. In the construction, an algebraic object appear : the persistence module. The generalisation of this object with index set of higher dimension still present a challenge for their understanding. This poster will present persistence in the multidimensional setting and some of it's properties.Topology-based metric learningOliver GäfvertKTH Royal Institute of TechnologyMetric learning (or similarity learning) is the task of learning a metric that, in some sense, maximally separates two sets of objects. Using a parametrized class of pseudometrics on persistence modules, derived from what we call persistence contours. We define a scheme for learning a pseudometric on persistence modules that maximally (in some sense) separates between two given sets of persistence modules. In the one-dimensional case this means that we're optimizing the parameters of a pseudometric on the space of barcodes in such a way that it ''maximally'' separates two given sets of barcodes. The resulting metric can then be used for classifying unseen barcodes into the two sets.Persistent chain complexesBarbara GiuntiUniversità di PaviaPersistent homology of the Vietoris-Rips complex has proven to be an effective tool to extract geometric information from data sets. Homology, however, is a drastic simplification and in certain situations might remove too much information. This project aims to investigate invariants, extracted not from the homology but rather from the persistent chain complex. For this approach to be useful, a decomposition structure theorem for persistence chain complexes is necessary. That is the crucial fundament theorem of this project. This presentation aims to describe the indecomposable and present an algorithm of how to enumerate indecomposable summands for a persistent chain complex.Geometric comparison of phylogenetic trees on varying taxa setsGillian GrindstaffThe University of Texas at AustinAs discovered by Billera, Holmes, and Vogtmann in 2001, the set of phylogenetic trees on n taxa form a CAT(0) contractible geodesic metric space, referred to as BHV space, which allows for the adaptation of statistical techniques to sets of phylogenies. However, the problem of comparing or combining trees on different sets of taxa has not been possible using the same methods. Davidson et al. recently created a combinatorial algorithm to extend tree topologies to higher dimensional sectors of BHV space. We adapt their algorithm for metric trees to compute the subspace of extensions of a tree on fewer than n taxa to n-dimensional BHV space. We also define several metrics which can be used to measure the compatibility of multiple metric tree fragments, and the degree to which a collection of fragments can be realized as subtrees of the same tree structure.Holonomic extended least angle regressionMarc HarkonenGeorgia Institute of TechnologyModel selection algorithms are used to automatically choose the statistical model or family of models that best fit our data. For example in generalized linear models each parameter corresponds to one covariate, and then one might ask which of these are actually useful, or the most “impactful” in our fitting procedure, and how do we actually decide which covariates to include in the model. There have been numerous attempts at automatizing this process. Most notably, the Least Angle Regression algorithm, or LARS, by Efron et al. is a computationally efficient algorithm that ranks the covariates of a linear model without being too greedy. The LARS algorithm was extended in 2010 by Hirose and Komaki for a class of distributions in the generalized linear model by using properties of the manifold of exponential families as a dually flat manifold. However this extension assumes that the normalizing constant of the joint distribution of observations is “easy” to compute. This is often not the case, for example the normalizing constant may contain an integral that has no analytic solution.
However, if the normalizing constant satisfies a holonomic system, we can circumvent this issue by using a method inspired by the Holonomic Gradient Method from Nakayama et al. In this poster we present a modification of the holonomic gradient method and add it to the extended LARS algorithm. We call this the holonomic extended least angle regression algorithm, or HELARS. The algorithm was implemented using the statistical software R, and was tested with real and simulated datasets.Matroid Structure in Computational TopologyGregory Henselman-PetrusekPrinceton UniversityThe proposed poster would highlight recent developments in matroid theory, homological algebra, and computation, with a view to statistical topology. Several of the premier applications in this field rest on the decomposition of homological persistence modules into 1-dimensional factors that realize, collectively, the structure of a finitary matroid. The combinatorial properties of this object have profound consequences in statistical topology and homological algebra, with numerous connections to Lie, Hodge, and Gromov-Witten theory. As an elementary application, we present the Eirene library for homological persistence, an open-source platform for persistence module decomposition that outperforms, in several cases of interest, the current state-of-the-art by orders of magnitude in both time and memory.Model Selection by Algebraic CriteriaShaoxiong HuQueen Mary University of LondonPlease see the abstract as PDF file.Statistical tools for persistence diagrams via smoothed Optimal TransportThéo LacombeInria SaclayStandard metrics on persistence diagrams (Bottleneck or Wasserstein) provide a powerful theoretical framework to compare topological descriptors. However, their combinatorial nature makes them hard to handle from a statistical perspective. By leveraging recent progress in optimal transport theory, we provide an approximation of these metrics which presents interesting properties for statistical and numerical purposes. We show applications to the so-called Wasserstein barycenter problem for persistence diagrams and other numerical problems.Activity-based Graphical Models for Time SeriesZhangsheng LaiSingapore University of Design and TechnologyGiven a directed graph, possibly with cycles and self-loops, we define a continuous-time Markov chain where the nodes are binary random variables and where the off-diagonal entries of the transition rate matrix are defined by monomials in the model parameters. We call this Markov chain a McCulloch-Pitts process, and its defining monomials are known as activities. Such processes play an important role in biologically plausible deep learning, because they explain an experimentally-observed phenomenon in mammalian brains known as Spike Timing Dependent Plasticity. The learning rules for the model are simple, local and distributed, so we may finally be able to train massive neural networks. The model is a toric variety in the space of stochastic rate matrices, and learning corresponds to movement along this toric variety to a point that is closest to the empirical distribution by means of multiplicative weight updates. This is joint work with Chris Hillar and Shaowei Lin.Sheaves on configuration spacesJānis LazovskisUniversity of Illinois at ChicagoCombining all the configuration spaces of a manifold gives the Ran space, which has a natural topology on it. There exists a constructible sheaf on this space, which can be described explicitly or by using a duality between such sheaves and functors of exit paths. Further, taking the product of the Ran space with non-zero real numbers gives a sheaf that classifies the Cech (or Vietoris-Rips) complexes that come from point samples of the base manifold. In other words, this sheaf contains the same information as the persistent homology of all point clouds on the manifold.Simplicial networks and effective resistanceKang-Ju LeeSeoul National UniversityWe introduce the notion of effective resistance for a \emph{simplicial network} $(X,R)$ where $X$ is a simplicial complex and $R$ is a set of resistances for the top simplices, and prove two formulas generalizing previous results concerning effective resistance for resistor networks. Our approach, based on combinatorial Hodge theory, is to assign a unique harmonic class to a \emph{current generator} $\sigma$, an extra top-dimensional simplex to be attached to $X$. We will show that the harmonic class gives rise to the \emph{current} $I_{\sigma}$ and the \emph{voltage} $V_{\sigma}$ for $X\cup\sigma$, satisfying Thompson's energy-minimizing principle and Ohm's law for simplicial networks. The effective resistance $R_{\sigma}$ of a current generator $\sigma$ shall be defined as a ratio of the $\sigma$-components of $V_{\sigma}$ and $I_{\sigma}$. By introducing \emph{potential} for voltage vectors, we present a formula for $R_\sigma$ via the inverse of the weighted combinatorial Laplacian of $X$ in codimension one. We also derive a formula for $R_{\sigma}$ via weighted high-dimensional tree-numbers for $X$, providing a combinatorial interpretation for $R_{\sigma}$. As an application, we generalize Foster's Theorem, and discuss various high-dimensional examples. This is a joint work with Woong Kook.Analysis of contagion maps on a class of networks that are spatially embedded in a torusBarbara MahlerUniversity of OxfordThe spreading of real-world contagions is often guided by the geometry of the underlying domain. One example are contagious diseases that historically spread gradually along part of the earth's surface.
In this case contagions propagate as a wave front, passing between geometrically close entities. However, in today's world with modern transport and communication technology there are many scenarios where, even in the presence of a well-defined underlying geometry, such as the spherical surface of the earth, a contagion spreads via connections that are not intrinsically geometric.
Examples of such scenarios include the spreading of an infectious disease via passengers on a long-distance flight, or the dissemination of information via internet communication. Here the contagion jumps across space to distant locations rather that strictly following the geometry of the underlying domain.
Following the methodology introduced in [1], we study such phenomena by considering a threshold contagion model on a family of networks that can be construed as being embedded in a torus and have both `geometric edges' that respect the geometry of the underlying torus, and `non-geometric' edges that are not constrained by the underlying geometry and can connect nodes across a long range. We investigate what propagation pattern a contagion follows and how much this pattern is influenced by the structure of the underlying space.
[1] D. Taylor et al, “Topological data analysis of contagion maps for examining spreading processes on networks,” Nature communications, vol. 6, 2015.Simplifying a simplicial tower for efficient computation of Persistent HomologySiddharth PritamInria, Sophia Antipolis FranceWe present a novel approach to reduce the size of a simplicial tower ( simplicial complexes connected through general simplicial maps) whille preserving their persistent homology. This technique is easily extendable in the case of Zigzag persistence as well.Optimal Bandwidth for Nonparametric Estimation of Density Level SetsWanli QiaoGeorge Mason UniversityBandwidth selection is crucial in the kernel estimation of density level sets. Risk based on the symmetric difference between the estimated and true level sets is usually used to measure their proximity. In this paper we provide an asymptotic Lp approximation to this risk, where p is characterized by the weight function in the risk. In particular the excess risk corresponds to an L2 type of risk, and is adopted in an optimal bandwidth selection rule for nonparametric level set estimation of d-dimensional density functions (d >= 1).Persistence contours for quantifying topological difference in 1D persistent homologyHenri RiihimäkiTampere University of Technology + KTH StockholmDifferent studies applying persistent homology in data analysis are giving evidence that topological features in different scales are important for the analysis task at hand. This is in contrast to the traditional view in persistence putting focus on features with high persistence. We develop tools called persistence contours to quantify topology at different scales with different emphasis. This leads to a simple functional invariant for persistence modules. Persistence contours also give general metrics between persistence objects.Barcode Embeddings, Persistence Distortion, and Inverse Problems for Metric GraphsYitzchak SolomonBrown UniversityStable topological invariants are a cornerstone of persistence theory and applied topology, but their discriminative properties are often poorly-understood. In this paper we demonstrate the generic injectivity of a rich homology-based invariant first defined by Dey, Wang, and Shi, which we think of as embedding a metric graph in the barcode space.A notion of homological clusteringGard SpreemannEPFLSpectral clustering of graphs has long been a standard tool. We propose a similar method of clustering simplicial complexes based on the simplicial Laplacian.Asymptotic Equivalence between nonparametric density estimation and Gaussian white noiseStefan SteinUniversity of WarwickFollowing a paper of Nussbaum (1996) I present the key steps in proving asymptotic equivalence in the Le Cam sense between nonparametric density estimation and Gaussian white noise. The poster will also include a short introduction to Le Cam theory from its classical perspective in terms of risk functions as well as from a category theoretical point of view.Cohomology of real moment-angle complexesElizabeth VidaurreUniversity of RochesterCertain subspaces of a product of spaces whose factors are labeled by the vertices of a simplicial complex are referred to as "polyhedral product spaces". Polyhedral products are given by taking the union of subproducts depending on the face category of a fixed simplicial complex on $m$ vertices and a labelled family of $m$ topological pairs. Such polyhedral products are realized by objects studied in combinatorics, commutative algebra and algebraic geometry. There are applications of polyhedral products to topological robotics.Offset hypersurfaces and persistent homology of algebraic varietiesMadeleine WeinsteinUC BerkeleyWe present connections between the true persistent homology of algebraic varieties and their corresponding offset hypersurfaces. By Hardt's theorem, true persistent homology events occur at algebraic numbers. By studying the offset hypersurface, we characterize when true persistent homology events occur. The degree of the offset hypersurface is bounded by the Euclidean Distance Degree of the starting variety, so this way we show a bound on the degree of the algebraic numbers at which the homological events occur.
I will present an overview of the many and varied applications of topology to neuroscience on which we have been working at the EPFL over the past three years.
Convex sensing, i.e. sensing by quasiconvex functions, naturally arises in neuroscience and other contexts. One example of a convex sensing problem is measuring the underlying dimension of data. A related problem is computing the ''nonlinear rank'', i.e. the minimal rank of a matrix modulo the action by the group of row-wise nonlinear monotone transformations. While an exact algorithm for computing the nonlinear rank is unknown, it can be efficiently estimated using topological methods, as it is closely related to the geometry and topology of hyperplane arrangements.
A natural tool that captures the essence of convex sensing problems is directed complexes, which capture much of the relevant geometric information. For example, the nonlinear rank, as well as other geometric properties of data can be estimated from the homology of an associated directed complex. I will present recent results and conjectures about the directed complexes associated to some convex sensing problems.
In 2013 Chan, Carlsson, and Rabadan introduced a topological approach to the study of genetic recombination, based on persistent homology. In this work, we develop theoretical foundations for this. First, we clarify the motivation for the topological approach by presenting a novel formulation of the underlying inference problem. Specifically, we introduce and study the novelty profile, a simple, stable statistic of an evolutionary history which not only counts recombination events but also quantifies how recombination creates genetic diversity. We propose that the (hitherto implicit) goal of the topological approach to recombination is the estimation of novelty profiles.
We then develop a theory for the estimation of novelty profiles using barcodes. We focus on a low-recombination regime, where the ancestry of a population-genetic sample can be described using directed acyclic graphs called galled trees, which differ from trees only by isolated topological defects. We show that in this regime, under a complete sampling assumption, the 1st persistence barcode of evolutionary data yields a lower bound on the novelty profile, and hence on the number of historical recombination events. For i>1, the ith barcode is empty. In addition, we use the stability of persistent homology to extend these results to ones which hold for an arbitrary subsample of an arbitrary evolutionary history. We establish these results as corollaries of new mathematical results about the topology of certain Vietoris-Rips filtrations, which may be of independent interest.
Joint work with Raul Rabadan and Daniel Rosenbloom
Partial least squares (PLS) is a method to discover a functional dependence between two sets of variables X and Y. PLS attempts to maximize the covariance between X and Y by projecting both onto new subspaces. Higher order partial least squares (HOPLS) comes into play when the sets of variables have additional tensorial structure. Simultaneous optimization of subspace projections may be obtained by a multilinear singular value decomposition (MSVD). I'll review PLS and SVD, and explain their higher order counterparts. Finally I'll describe recent work with G. Deshpande, A. Cichocki, D. Rangaprakash, and X.P. Hu where we propose to use HOPLS and Tensor Decompositions to discover latent linkages between EEG and fMRI signals from the brain, and ultimately use this to drive Brain Computer Interfaces (BCI)'s with the low size, weight and power of EEG, but with the accuracy of fMRI.
In low-rank optimization one seeks to minimize functions on manifolds or varieties of low-rank matrices or tensors. This has useful applications in signal and image processing, or in high-dimensional scientific computing. Due to their parametrization by multilinear maps, sets of low-rank matrices and tensors exhibit a rich geometric structure, which sometimes makes it possible to go beyond generic results in the analysis of corresponding nonlinear optimization methods on these sets. This concerns for example necessary first-order optimality conditions or classification of critical points. In this talk we discuss some of these questions.
Participants
Carlos Améndola
TU Munich
Eleonora Andreotti
University of Bologna
Djordje Baralic
Mathematical Institute SANU
Ulrich Bauer
Technical University of Munich
Nicolas Berkouk
INRIA Saclay
Nello Blaser
University of Bergen
Magnus Botnan
Technical University of Munich
Madeline Brandt
University of California, Berkeley
Paul Breiding
MPI MIS Leipzig
Morten Brun
University of Bergen
Mickaël Buchet
TU Graz
Johannes Bühl
Leibniz Institute for Tropospheric Research (TROPOS)
Ben Cassidy
Columbia University
Andrew Chen
Columbia University
Jérémy Cochoy
INRIA / Paris-Sud
Alessio D'Alì
MPI MiS Leipzig
Alessandro De Gregorio
Politecnico di Torino
Stefania Ebli
EPFL
Herbert Edelsbrunner
Institute of Science and Technology Austria
Daniela Egas Santander
EPFL
Marzieh Eidi
Max Planck Institute for Mathematics in the sciences
Yury Elkin
University of Liverpool
Alperen Ali Ergur
TU Berlin
Domenico Felice
MPI MSI
Ioan Filip
Columbia University
Ulderico Fugacci
TU Graz
Oliver Gäfvert
KTH Royal Institute of Technology
Barbara Giunti
Università di Pavia
Christiane Görgen
Max Planck Institute for Mathematics in the Sciences
Paul Görlach
Max Planck Institute Leipzig
Dejan Govc
University of Aberdeen
Gillian Grindstaff
The University of Texas at Austin
Marc Harkonen
Georgia Institute of Technology
Heather Harrington
University of Oxford
Corey Harris
MPI Leipzig
Gregory Henselman-Petrusek
Princeton University
Kathryn Hess
École polytechnique fédérale de Lausanne
Shaoxiong Hu
Queen Mary University of London
Stephan Huckemann
University of Göttingen
Vladimir Itskov
The Pennsylvania State University
Simon Johanning
University of Leipzig
Jürgen Jost
Max Planck Institute for Mathematics in the Sciences
Siargey Kachanovich
Inria Sophia Antipolis Méditerranée
Sara Kališnik Verovšek
Max Planck Institute for Mathematics in the Sciences
Giorgi Khimshiashvili
Ilia State University
Théo Lacombe
Inria Saclay
Zhangsheng Lai
Singapore University of Design and Technology
Pierre Lairez
INRIA Saclay Île-de-France
Jānis Lazovskis
University of Illinois at Chicago
Kang-Ju Lee
Seoul National University
Michael Lesnick
Princeton University
Davorin Lešnik
University of Ljubljana
Jose Luis Licon Salaiz
University of Cologne
Vlada Limic
Université de Strasbourg
Barbara Mahler
University of Oxford
Orlando Marigliano
University of Bonn
Pascal Massart
Université de Paris-Sud
Stephan Mescher
Universitaet Leipzig
Mateusz Michalek
MPI MiS
Anthea Monod
Columbia University
Raffaella Mulas
Max Planck Institute for Mathematics in the Sciences
Nicolas Ninin
EPFL
Luke Oeding
Auburn University
Beatriz Pascual Escudero
Universidad Autónoma de Madrid
Alice Patania
Indiana University
Max Pfeffer
MPI MIS Leipzig
Wolfgang Polonik
University of California at Davis
Siddharth Pritam
Inria, Sophia Antipolis France
Wanli Qiao
George Mason University
Henri Riihimäki
Tampere University of Technology + KTH Stockholm
Olivia Röhrig
TU Berlin
Tim Römer
Universität Osnabrück
Mahsa Sayyary Namin
MPI MIS
Hartwig Senska
KIT (Karlsruher Institut für Technologie)
Emre Sertöz
MPI MiS
Tim Seynnaeve
Max Planck Institute for Mathematics in the Sciences
Kristin Shaw
MPI MIS
Primoz Skraba
University of Primorska
Yitzchak Solomon
Brown University
Gard Spreemann
EPFL
Stefan Stein
University of Warwick
Bernd Sturmfels
Max Planck Institute for Mathematics in the Sciences
Jacinta Torres
Max Planck Institute for Mathematics in the Sciences
André Uschmajew
Max Planck Institute for Mathematics in the Sciences
Francesco Vaccarino
Politecnico di Torino
Elizabeth Vidaurre
University of Rochester
Max Wardetzky
University of Göttingen
Madeleine Weinstein
UC Berkeley
Felix Wolf
Universität Bonn
Haoqing Wu
EPFL
Marius Yamakou
Max Planck Institute for Mathematics in the Sciences
Sylvia Yaptieu
Max-Planck-Institut for Mathematics in the Sciences
Jun Zhang
University of Michigan
Pascal Zschumme
Karlsruhe Institute of Technology
Scientific Organizers
Christiane Görgen
Max-Planck-Institut für Mathematik in den Naturwissenschaften
Sara Kališnik Verovšek
Max-Planck-Institut für Mathematik in den Naturwissenschaften
Vlada Limic
Université de Strasbourg and CNRS, Paris
Administrative Contact
Saskia Gutzschebauch
Max-Planck-Institut für Mathematik in den Naturwissenschaften
Contact via Mail