Tropical Sufficient Statistics for Persistent Homology
- Anthea Monod (Columbia University)
Abstract
Topological data analysis has proven to be an innovative approach to studying genetic reassortment events in RNA viruses, a key factor in understanding viral mutations such as those in HIV. In this talk, I will show how a sufficient statistic for persistence barcodes, an important invariant in topological data analysis, can be constructed using tropical geometry, and how it allows classical probability distributions to be assumed on the tropicalized representation of barcodes. This makes a variety of parametric statistical inference methods applicable to barcodes, all while maintaining their original interpretations. More specifically, exponential family distributions may be assumed and likelihood functions for persistent homology may be constructed. We demonstrate sufficiency conceptually and illustrate its utility in dimensions 0 and 1 with a rigorous parametric statistical analysis of inter- and intra-subtype reassortment events in both HIV and avian influenza. This is joint work with Sara Kalisnik Verovsek, Juan Angel Patino-Galindo and Lorin Crawford, and is supported by the US NIH under project number R01GM117591.