MPI for Mathematics in the Sciences

A3 01 (Sophus-Lie room)

seminar

22.03.24 30.06.25

MiS Mathematics Lab Seminar


- Past Events


30.06.25

Dexiong Chen (Max-Planck-Institut for Biochemistry) : Towards graph foundation models: from language modeling to graph modeling

Graph learning has witnessed rapid progress in recent years. While message passing neural networks (MPNNs) remain dominant, Graph Transformers (GTs) have emerged as promising alternatives, offering global context and scalability—traits that have driven breakthroughs in language modeling. In this talk, I will revisit a representative Graph Transformer model from 2022 and discuss why it is unlikely to be the final solution for graph representation learning. I will then introduce alternative approaches, particularly random walk–based models, that address key limitations of both MPNNs and GTs. Building on recent advances in language modeling, I will show how to capture long-range dependencies within random walks to construct more expressive models. Finally, I will present how random walks can be leveraged for graph generation, illustrating how graphs can be modeled analogously to language through reversible transformations between graphs and random sequences. This approach opens new possibilities for graph foundation models that could transform how we understand and generate complex graph-structured data.


20.09.24

Vincent Grande (RWTH Aachen) : Hodge Learning: Relating global topology and local features of point clouds.

Topological Data Analysis (TDA) allows us to extract powerful topological and higher-order information on the global shape of a data set or point cloud. Tools like Persistent Homology or the Euler Transform give a single complex description of the global structure of the point cloud. However, sometimes, we are interested in translating this global information back to local node-level features, where the individual points have some real-world meaning. This talk will be about how we can achieve this using the Hodge Laplacian and concepts from TDA, Topological Signal Processing, and Differential Geometry. I will also talk about what we can learn from persistent homology by varying the distance function on the underlying space and analysing the corresponding shifts in the persistence diagrams.
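
The abstract does not spell out the construction, so the following is background only: a minimal sketch of the standard Hodge 1-Laplacian $L_1 = B_1^\top B_1 + B_2 B_2^\top$ of a simplicial complex, whose kernel dimension equals the first Betti number (the number of independent loops). The toy complex (a single triangle, hollow or filled), the helper names, and the edge orientation convention are assumptions made for illustration and are not taken from the talk.

```python
from fractions import Fraction

# Toy complex (an assumption): nodes 0, 1, 2 and edges (0,1), (0,2), (1,2).
# B1 is the node-by-edge boundary matrix; edge (i, j) with i < j has -1 at i, +1 at j.
B1 = [
    [-1, -1, 0],
    [1, 0, -1],
    [0, 1, 1],
]
# Boundary of the filled triangle (0,1,2) in the edge basis: (1,2) - (0,2) + (0,1).
TRIANGLE = [1, -1, 1]


def hodge_laplacian_1(b1, triangle_columns):
    """Return L1 = B1^T B1 + B2 B2^T, with B2 given as a list of its columns."""
    n_edges = len(b1[0])
    lap = [
        [sum(row[i] * row[j] for row in b1) for j in range(n_edges)]
        for i in range(n_edges)
    ]
    for col in triangle_columns:  # B2 B2^T is a sum of outer products
        for i in range(n_edges):
            for j in range(n_edges):
                lap[i][j] += col[i] * col[j]
    return lap


def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def first_betti_number(b1, triangle_columns):
    """Dimension of the kernel of L1, i.e. the number of independent loops."""
    lap = hodge_laplacian_1(b1, triangle_columns)
    return len(lap) - rank(lap)


hollow = first_betti_number(B1, [])          # triangle boundary only: one loop
filled = first_betti_number(B1, [TRIANGLE])  # filled triangle: loop is closed
```

Vectors in the kernel of $L_1$ (harmonic edge flows) are supported on individual edges, which is one standard way global homology can be tied back to local elements of the complex.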

22.03.24

Peter Stadler (IZBI, Leipzig University) : Gene Family Histories and Graph Data in Phylogenetics

Genes evolve within their species by means of gene duplication, gene loss, and occasional horizontal gene transfer, leading to a family of related genes distributed over a set of species. In each speciation event, all genes are faithfully transmitted into the separating lineages. As long as horizontal transfer is the exception rather than the rule for most gene families, species are well-defined and, like their genes, evolve along trees. The history of a gene family is then defined as the mapping of the gene tree $T$ into the species tree $S$, such that inner vertices that represent speciations in $T$ are mapped to inner vertices of $S$. Such evolutionary scenarios can be used to define several useful vertex-colored graphs that represent partial information on the gene family history. For example, in the orthology graph, two vertices are adjacent if the corresponding genes have a speciation event as their last common ancestor. In the best match graph, a directed edge connects $x$ and $y$ from different species if $y$ is, in its species, a closest relative of $x$. In the LDT graph, $x$ and $y$ are adjacent if their last common ancestor is younger than the last common ancestor of the species in which they reside. The interest in these and other graphs stems from the fact that they can be inferred more or less directly from sequence similarity data without the need to construct the gene tree $T$ or the species tree $S$. We discuss to what extent gene family histories are determined by these graphs. Since these graphs have very specific mathematical structures, correcting empirical estimates to conform to these structures provides a powerful way of reducing noise in the data. Taken together, this suggests a graph-based approach to gene family histories that avoids many of the practical issues with classical phylogenetic approaches that require accurate gene and species trees as a first step.
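
The orthology-graph definition in this abstract can be illustrated on a toy gene tree. This is a minimal sketch, assuming the event-labeled gene tree is already known (the abstract's point is that such graphs can be estimated without constructing it); the tree encoding via `parent` and `event` dictionaries, the gene names, and the function names are hypothetical.

```python
from itertools import combinations

# Toy gene tree (an assumption): root r is a speciation, its child d is a
# duplication. Genes a1 and a2 descend from d; gene c1 hangs off the root.
parent = {"a1": "d", "a2": "d", "d": "r", "c1": "r"}
event = {"r": "speciation", "d": "duplication"}
genes = ["a1", "a2", "c1"]


def ancestors(node, parent):
    """Return the path from node up to the root, node included."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def last_common_ancestor(a, b, parent):
    """Return the deepest vertex that lies above both a and b."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def orthology_graph(genes, parent, event):
    """Edges join two genes whose last common ancestor is a speciation."""
    return {
        frozenset((a, b))
        for a, b in combinations(genes, 2)
        if event[last_common_ancestor(a, b, parent)] == "speciation"
    }


edges = orthology_graph(genes, parent, event)
```

In this toy tree, `a1` and `a2` are separated by a duplication and are therefore not adjacent, while each of them is adjacent to `c1`.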

22.03.24

Bruno J. Schmidt (IZBI, Leipzig University + MPI MiS, Leipzig) : Pre-talk: Sorting by TDRL and iTDRL

Tandem duplication random loss (TDRL) and inverse tandem duplication random loss (iTDRL) are mechanisms of mitochondrial genome rearrangement that can be modeled as simple operations on signed permutations. Informally, they comprise the duplication of a subsequence of a permutation, where in the case of iTDRL the copy is inserted with inverted order and signs. In the second step, one copy of each duplicated element is removed, such that the result is again a signed permutation. The TDRL/iTDRL sorting problem consists in finding the minimal number of TDRL or iTDRL operations necessary to convert the identity permutation $\iota$ into a given permutation $\pi$. We introduce a simple signature, called the misc-encoding, of permutation $\pi$. This construction is used to design an $O(n \log n)$ algorithm to solve the TDRL/iTDRL sorting problem.
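
A single TDRL or iTDRL step, as described informally in this abstract, can be sketched as follows. The parameterization is an assumption: the random loss is encoded by `keep_first`, the set of elements (by absolute value) whose first copy survives, while all other elements of the segment survive in the second copy. The function name and arguments are hypothetical, and the sketch does not implement the misc-encoding or the sorting algorithm from the talk.

```python
def tdrl_step(perm, start, end, keep_first, inverse=False):
    """Apply one TDRL (or iTDRL if inverse=True) to the segment perm[start:end].

    perm       -- signed permutation as a list of nonzero integers
    keep_first -- absolute values of the elements kept from the first copy;
                  every other segment element is kept from the second copy
    """
    segment = perm[start:end]
    if inverse:
        # iTDRL: the copy is inserted with inverted order and inverted signs.
        copy = [-x for x in reversed(segment)]
    else:
        # TDRL: the copy is an exact tandem duplicate of the segment.
        copy = list(segment)
    # Random loss: exactly one of the two copies of each element survives.
    first = [x for x in segment if abs(x) in keep_first]
    second = [x for x in copy if abs(x) not in keep_first]
    return perm[:start] + first + second + perm[end:]


identity = [1, 2, 3, 4, 5]

# TDRL on the whole permutation, keeping 1, 3, 5 from the first copy.
after_tdrl = tdrl_step(identity, 0, 5, {1, 3, 5})

# iTDRL on the segment [2, 3, 4], keeping only 3 from the first copy.
after_itdrl = tdrl_step(identity, 1, 4, {3}, inverse=True)
```

Here `after_tdrl` is `[1, 3, 5, 2, 4]` and `after_itdrl` is `[1, 3, -4, -2, 5]`; both are again signed permutations of the same five elements.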
