A simple algorithm for discovering causal structure in high-dimensional data

  • Korbinian Strimmer (Institut für medizinische Informatik, Statistik und Epidemiologie, Universität Leipzig, Leipzig, Germany)
  • Rainer Opgen-Rhein, LMU
A3 01 (Sophus-Lie room)


A simple, versatile statistical learning algorithm for inferring large-scale causal networks is proposed. The procedure is based on a decomposition of the regression coefficient into the product of three factors: the square root of the ratio of standardized partial variances, the partial correlation, and a scale factor. Multiple testing of the log-ratio of standardized partial variances yields an estimate of the partial ordering of the nodes, which is subsequently projected onto the underlying partial correlation graph. As a result, a (partially) causal network is recovered, using only the positive definite pairwise correlation matrix as input. The statistical and computational properties of the new approach are investigated and illustrated by analyzing a large-scale Arabidopsis thaliana gene expression time series experiment.
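
The decomposition named in the abstract can be checked numerically. The following is a minimal sketch, not the speakers' implementation: it assumes the standard definitions in terms of the precision matrix (partial variance of a variable as the reciprocal of the corresponding diagonal entry of the inverse covariance, standardized partial variance as partial variance divided by variance), and the function names and the random test covariance are purely illustrative. The multiple-testing and graph-projection steps are not covered.

```python
import numpy as np


def decompose_regression_coefficient(cov, y, k):
    """Return the three factors (partial correlation, sqrt of the ratio of
    standardized partial variances, scale factor) for regressing y on k
    given all other variables. Definitions are assumptions of this sketch."""
    cov = np.asarray(cov, dtype=float)
    prec = np.linalg.inv(cov)  # precision (inverse covariance) matrix

    # Partial correlation between y and k given all remaining variables.
    pcor = -prec[y, k] / np.sqrt(prec[y, y] * prec[k, k])

    # Standardized partial variance: partial variance / marginal variance.
    spv_y = 1.0 / (prec[y, y] * cov[y, y])
    spv_k = 1.0 / (prec[k, k] * cov[k, k])

    # Scale factor: ratio of marginal standard deviations.
    scale = np.sqrt(cov[y, y] / cov[k, k])

    return pcor, np.sqrt(spv_y / spv_k), scale


def regression_coefficients(cov, y):
    """Population least-squares coefficients of variable y on all others."""
    cov = np.asarray(cov, dtype=float)
    others = [j for j in range(cov.shape[0]) if j != y]
    beta = np.linalg.solve(cov[np.ix_(others, others)], cov[others, y])
    return dict(zip(others, beta))


# Illustrative positive definite covariance matrix (made-up data).
rng = np.random.default_rng(0)
a = rng.normal(size=(5, 5))
cov = a @ a.T + 5 * np.eye(5)

y, k = 0, 3
pcor, spv_factor, scale = decompose_regression_coefficient(cov, y, k)
beta_direct = regression_coefficients(cov, y)[k]

# The product of the three factors should match the regression coefficient.
print(np.isclose(pcor * spv_factor * scale, beta_direct))
```

Under these assumptions the quantity that would enter the testing step is the log-ratio of the two standardized partial variances, that is, twice the logarithm of the second returned factor.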

Antje Vandenberg

Max-Planck-Institut für Mathematik in den Naturwissenschaften

Nihat Ay

Max Planck Institute for Mathematics in the Sciences