The quartet distance between phylogenetic trees and an application beyond phylogenetics

  • Stefan Grünewald (CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai, China)
E1 05 (Leibniz-Saal)


In phylogenetic data analysis, using different methods or different data sets frequently results in different trees with the same label set. Moreover, some heuristic optimization algorithms search for a (locally) optimal tree by repeatedly constructing a tree that differs from a given one, but not by too much. Both situations require a way to quantify the dissimilarity between two phylogenetic trees with identical leaf sets.

One common measure for this is the quartet distance, defined as the number of sets of exactly four taxa on which the two trees have different restrictions. Bandelt and Dress showed in 1986 that the maximum distance between two binary trees, normalized by the number of all 4-sets, is monotone decreasing in n, the number of taxa. They conjectured that the limit of this ratio is 2/3, which corresponds to the distance between two random trees.
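
To make the definition concrete, here is a minimal brute-force sketch of the quartet distance. It is an illustration only, not the method used in the talk. It assumes trees are written as nested Python tuples of leaf labels, and the names `clusters`, `quartet_topology` and `quartet_distance` are hypothetical helpers introduced here. A 4-set on which one tree is resolved and the other is not counts as a difference.

```python
from itertools import combinations


def clusters(tree):
    """Collect the leaf set below every internal node of a nested-tuple tree."""
    found = []

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.append(below)
        return below

    leaves(tree)
    return found


def quartet_topology(cluster_list, quartet):
    """Return the split {{a, b}, {c, d}} induced on a 4-set, or None if unresolved."""
    q = frozenset(quartet)
    for cluster in cluster_list:
        inside = q & cluster
        if len(inside) == 2:
            # This cluster separates two of the four taxa from the other two.
            return frozenset([inside, q - inside])
    return None


def quartet_distance(tree1, tree2):
    """Count the 4-sets of taxa on which the two trees induce different restrictions."""
    c1, c2 = clusters(tree1), clusters(tree2)
    taxa1, taxa2 = max(c1, key=len), max(c2, key=len)
    if taxa1 != taxa2:
        raise ValueError("both trees must have the same leaf set")
    return sum(
        quartet_topology(c1, q) != quartet_topology(c2, q)
        for q in combinations(sorted(taxa1), 4)
    )


# Two binary trees on five taxa that differ only in the placement of b and c.
t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = ((("a", "c"), "b"), ("d", "e"))
print(quartet_distance(t1, t2))  # 2 of the 5 possible 4-sets differ
```

The sketch enumerates all 4-sets and is therefore only practical for small trees. Dividing the result by the number of 4-sets gives the normalized distance referred to above.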

I will give a generalization of the conjecture to arbitrary X-trees and show some partial results. As a consequence, we find that quartets can be used to quantify the dependence between two real-valued random variables or between two data sets. In some sense, this measure of dependence is analogous to Kendall's tau as a measure of correlation.
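
For comparison, Kendall's tau is itself a counting statistic: it compares every pair of observations and records whether the two variables order that pair the same way. The sketch below computes the simplest variant (tau-a, with no tie correction). It illustrates only the pair-based counting and makes no assumption about how the quartet-based measure from the talk is constructed. The function name `kendall_tau_a` is chosen here for illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


print(kendall_tau_a([1, 2, 3], [1, 2, 3]))  # 1.0 for perfectly concordant data
```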

Antje Vandenberg

Max Planck Institute for Mathematics in the Sciences

Jürgen Jost

Max-Planck-Institut für Mathematik in den Naturwissenschaften

Peter Stadler

Leipzig University