Quantifying Genetic Innovation: Mathematical Foundations for the Topological Study of Reticulate Evolution
- Michael Lesnick (Princeton University)
Abstract
In 2013, Chan, Carlsson, and Rabadan introduced a topological approach to the study of genetic recombination, based on persistent homology. In this work, we develop theoretical foundations for that approach. First, we clarify its motivation by presenting a novel formulation of the underlying inference problem. Specifically, we introduce and study the novelty profile, a simple, stable statistic of an evolutionary history which not only counts recombination events but also quantifies how recombination creates genetic diversity. We propose that the (hitherto implicit) goal of the topological approach to recombination is the estimation of novelty profiles.
We then develop a theory for the estimation of novelty profiles using barcodes. We focus on a low-recombination regime, where the ancestry of a population-genetic sample can be described using directed acyclic graphs called galled trees, which differ from trees only by isolated topological defects. We show that in this regime, under a complete sampling assumption, the first persistence barcode of evolutionary data yields a lower bound on the novelty profile, and hence on the number of historical recombination events, while for i > 1 the ith barcode is empty. In addition, we use the stability of persistent homology to extend these results to arbitrary subsamples of arbitrary evolutionary histories. We establish these results as corollaries of new mathematical results about the topology of certain Vietoris-Rips filtrations, which may be of independent interest.
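As a rough illustration of what the first persistence barcode detects, the following minimal sketch computes the degree-1 barcode of a Vietoris-Rips filtration over Z/2 for a handful of binary sequences. Several choices here are assumptions made for the example rather than details taken from the abstract: Hamming distance is used as the genetic distance, a simplex enters the filtration at its largest pairwise distance, and the function names `hamming` and `rips_h1_barcode` are hypothetical. It is a toy implementation of the standard column-reduction algorithm, not the authors' method or code.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def rips_h1_barcode(seqs):
    """Finite bars of the degree-1 Vietoris-Rips barcode (Z/2 coefficients).

    Assumption: a simplex appears at the largest pairwise Hamming distance
    among its vertices. Only vertices, edges, and triangles are built,
    which is enough to determine degree-1 homology.
    """
    n = len(seqs)
    dist = [[hamming(a, b) for b in seqs] for a in seqs]

    # Enumerate simplices as (filtration value, dimension, vertex tuple).
    simplices = []
    for k in (1, 2, 3):
        for s in combinations(range(n), k):
            val = max((dist[i][j] for i, j in combinations(s, 2)), default=0)
            simplices.append((val, k - 1, s))

    # Sorting by (value, dimension, vertices) puts every face before its cofaces.
    simplices.sort()
    index = {s: i for i, (_, _, s) in enumerate(simplices)}

    reduced = {}  # pivot row index -> reduced boundary column (set of rows)
    bars = []
    for j, (val, dim, s) in enumerate(simplices):
        # Boundary of s over Z/2: the set of its codimension-1 faces.
        col = set()
        if dim > 0:
            col = {index[s[:i] + s[i + 1:]] for i in range(len(s))}
        # Standard reduction: cancel the pivot against earlier columns.
        while col and max(col) in reduced:
            col ^= reduced[max(col)]
        if col:
            pivot = max(col)
            reduced[pivot] = col
            birth, birth_dim, _ = simplices[pivot]
            # A triangle killing a cycle born at an edge gives a degree-1 bar;
            # zero-length bars are discarded.
            if birth_dim == 1 and val > birth:
                bars.append((birth, val))
    return sorted(bars)


# Four sequences containing all four two-site haplotypes.
print(rips_h1_barcode(["00", "01", "10", "11"]))
# Four sequences related by successive single mutations (tree-like).
print(rips_h1_barcode(["000", "100", "110", "111"]))
```

For the four haplotypes 00, 01, 10, 11, which cannot all arise on a tree when each site mutates at most once, the sketch returns a single bar `[(1, 2)]`. For the chain 000, 100, 110, 111, whose Hamming distances form a tree metric, it returns an empty list. This matches the intuition that bars in the first barcode signal departures from tree-like ancestry, though a toy example of this size says nothing about the bounds proved in the work itself.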
Joint work with Raul Rabadan and Daniel Rosenbloom.