seminar
20.09.24

# MiS Mathematics Lab Seminar

### Previous Events

22.03.24 Peter Stadler (IZBI, Leipzig University)

Gene Family Histories and Graph Data in Phylogenetics

Genes evolve within their species by means of gene duplication, gene loss, and occasional horizontal gene transfer, leading to a family of related genes distributed over a set of species. In each speciation event, all genes are faithfully transmitted into the separating lineages. As long as horizontal transfer is the exception rather than the rule for most gene families, species are well-defined and, like their genes, evolve along trees. The history of a gene family is then defined as the mapping of the gene tree $T$ into the species tree $S$, such that inner vertices that represent speciations in $T$ are mapped to inner vertices of $S$. Such evolutionary scenarios can be used to define several useful vertex-colored graphs that represent partial information on the gene family history. For example, in the orthology graph, two vertices are adjacent if the corresponding genes have a speciation event as their last common ancestor. In the best match graph, a directed edge connects $x$ and $y$ from different species if $y$ is, in its species, a closest relative of $x$. In the LDT graph, $x$ and $y$ are adjacent if their last common ancestor is younger than the last common ancestor of the species in which they reside. The interest in these and other graphs stems from the fact that they can be inferred more or less directly from sequence similarity data without the need to construct the gene tree $T$ or the species tree $S$. We discuss to what extent gene family histories are determined by these graphs. Since these graphs have very specific mathematical structures, correcting empirical estimates to conform to these structures provides a powerful way of reducing noise in the data.
Taken together, this suggests a graph-based approach to gene family histories that avoids many of the practical issues with classical phylogenetic approaches that require accurate gene and species trees as a first step.

22.03.24 Bruno J. Schmidt (IZBI, Leipzig University + MPI MiS, Leipzig)

Pre-talk: Sorting by TDRL and iTDRL

Tandem duplication random loss (TDRL) and inverse tandem duplication random loss (iTDRL) are mechanisms of mitochondrial genome rearrangement that can be modeled as simple operations on signed permutations. Informally, they comprise the duplication of a subsequence of a permutation, where in the case of iTDRL the copy is inserted with inverted order and signs. In the second step, one copy of each duplicated element is removed, such that the result is again a signed permutation. The TDRL/iTDRL sorting problem consists in finding the minimal number of TDRL or iTDRL operations necessary to convert the identity permutation $\iota$ into a given permutation $\pi$. We introduce a simple signature, called the misc-encoding, of the permutation $\pi$. This construction is used to design an $\mathcal{O}(n\log n)$ algorithm to solve the TDRL/iTDRL sorting problem.
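
To make the informal description of a single TDRL or iTDRL operation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm from the talk: the duplicated subsequence is assumed to be a contiguous segment, the copy is assumed to be placed directly after the original, and the function name `tdrl_step` and its parameters are hypothetical. It does not implement the misc-encoding or the $\mathcal{O}(n\log n)$ sorting algorithm.

```python
from typing import Iterable, List


def tdrl_step(perm: List[int], start: int, end: int,
              keep_in_original: Iterable[int],
              inverse: bool = False) -> List[int]:
    """Apply one TDRL step (or iTDRL step if inverse=True) to perm[start:end].

    perm is a signed permutation given as a list of nonzero integers.
    keep_in_original lists the elements (by absolute value) whose surviving
    copy is the one in the original segment; every other element of the
    segment survives only in the duplicated copy.
    """
    segment = perm[start:end]
    keep = set(keep_in_original)
    if not keep <= {abs(x) for x in segment}:
        raise ValueError("keep_in_original must be a subset of the segment")

    # Duplication (assumption: the copy sits directly after the original).
    # For iTDRL the copy has inverted order and inverted signs.
    copy = [-x for x in reversed(segment)] if inverse else list(segment)

    # Random loss: exactly one copy of each duplicated element is removed.
    survivors_original = [x for x in segment if abs(x) in keep]
    survivors_copy = [x for x in copy if abs(x) not in keep]

    return perm[:start] + survivors_original + survivors_copy + perm[end:]


identity = [1, 2, 3, 4, 5]
print(tdrl_step(identity, 0, 5, {1, 3, 5}))                # [1, 3, 5, 2, 4]
print(tdrl_step(identity, 0, 5, {1, 3, 5}, inverse=True))  # [1, 3, 5, -4, -2]
```

In both calls the whole identity permutation is duplicated and the elements 1, 3, 5 are kept in the original copy. The plain TDRL keeps 2 and 4 in their original order and sign, while the iTDRL keeps them reversed and negated, so the result is again a signed permutation in either case.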

### Diaaeldin Taha

MPI for Mathematics in the Sciences