Overestimating correlation between typological features

  • Michael Dunn (Uppsala University, Uppsala, Sweden)
Hörsaal Max-Planck-Institut für Evolutionäre Anthropologie (Leipzig)


Phylogenetic correlation is important: a major concern of functional/typological approaches to linguistics is the investigation of the causal interactions between aspects of language structure. Modern phylogenetic comparative methods offer new ways to address these questions. Dunn et al. (2011) used these methods in one such attempt: a large-scale study of word order variation in four language families showed that many so-called "universals" of language structure were not reliably detectable in all language families, belying their universal status. Beyond the evidence for the possible non-universality of language universals, this paper also presented instances of apparent correlated evolutionary change which violated typological predictions.

Maddison and FitzJohn (2015) enumerate some serious methodological challenges outstanding in the investigation of phylogenetic correlation, and identify a series of phylogenetic contexts where standard phylogenetic comparative methods such as Pagel (1994) perform poorly. For example, standard phylogenetic tests of correlation between two characters tend to overestimate the degree of correlation where there exists some third factor driving change in one character, but which is co-distributed with the other by chance. These (and other) challenges to the method do not undermine the entire enterprise, but there is clearly a need for improved statistical tests.
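The core of the problem can be illustrated with a toy calculation. The sketch below is not Pagel's (1994) test and is not taken from any of the cited papers; it is a deliberately naive tip-level Fisher exact test, and the two-clade tree, the function names, and the clade size are all assumptions made for illustration. Two binary characters each change exactly once, on the same branch, so every language in one clade shares both derived states. Counting languages as independent observations then yields a very small p-value, although the data contain only a single coincident evolutionary event.

```python
from math import comb

def tip_table(n_per_clade):
    """Build a 2x2 table of tip states for a hypothetical two-clade tree.

    Assumption: characters X and Y each changed once, on the branch leading
    to clade 2, so clade 1 tips are (0, 0) and clade 2 tips are (1, 1).
    """
    tips = [(0, 0)] * n_per_clade + [(1, 1)] * n_per_clade
    a = sum(1 for x, y in tips if x == 1 and y == 1)
    b = sum(1 for x, y in tips if x == 1 and y == 0)
    c = sum(1 for x, y in tips if x == 0 and y == 1)
    d = sum(1 for x, y in tips if x == 0 and y == 0)
    return a, b, c, d

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value: P(top-left cell >= a) with fixed margins."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    hits = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return hits / total

if __name__ == "__main__":
    table = tip_table(10)
    # Treats the 20 tips as independent, which ignores shared ancestry.
    print(table, fisher_one_sided(*table))
```

With ten languages per clade the naive p-value is 1/184756 (about 5.4e-06), and it shrinks further as the clades grow, even though the number of independent co-occurring changes stays at one.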

In this paper I investigate the inferred histories of word order features and attempt to classify our previously identified correlations between word order characters into "safe" and the various "at risk" categories. In some of these "at risk" cases of apparent correlation, historical linguistics can give us a more specific and detailed account of the diachronic processes of change. Through investigating these known cases I hope to clarify where phylogenetic tests of correlation are performing well on linguistic data, and to contribute to future efforts to solve the problem of inferring correlation where current methods perform poorly.

Dunn, Michael, Simon J. Greenhill, Stephen C. Levinson, and Russell D. Gray. 2011. Evolved structure of language shows lineage-specific trends in word-order universals. Nature 473:79–82.
Maddison, Wayne P., and Richard G. FitzJohn. 2015. The unsolved challenge to phylogenetic correlation tests for categorical characters. Systematic Biology 64(1):127–36.
Pagel, Mark. 1994. Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proceedings of the Royal Society B: Biological Sciences 255:37–45.

Antje Vandenberg

Max Planck Institute for Mathematics in the Sciences

Damián Blasi

MPI for Mathematics in the Sciences and MPI for Evolutionary Anthropology (Leipzig), Germany

Jürgen Jost

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Peter Stadler

Leipzig University, Germany

Russell Gray

Max Planck Institute for the Science of Human History (Jena), Germany

Bernard Comrie

Max Planck Institute for Evolutionary Anthropology (Leipzig), Germany

Stephen C. Levinson

MPI for Psycholinguistics (Nijmegen), Netherlands

Nihat Ay

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Sean Roberts

Max Planck Institute for Psycholinguistics (Nijmegen), Netherlands

Leonardo Lancia

Max Planck Institute for Evolutionary Anthropology (Leipzig), Germany