Biological Applications of Mutual Information
- Peter Grassberger (Max-Planck-Institut für Physik komplexer Systeme, Germany)
I will first recall the main features of mutual information (MI) as a measure of similarity or statistical dependence. In particular, I will discuss its embodiments in two versions of information theory: probabilistic (Shannon) versus algorithmic (Kolmogorov). I will compare two different strategies for estimating the algorithmic complexity of "texts", one involving both sequence alignment and file compression ("zipping"), the other zipping alone. When applied to mitochondrial DNA, both strategies lead to distance measures that outperform other distance measures currently in use. The last part of the talk will be devoted to estimating Shannon MI from real-valued data, with an application to microarray gene expression measurements. In particular, I will show that large differences between dependencies estimated from MI and from linear correlation measures may hint at interesting structures, which can then be explored further.
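
To make the "zipping alone" strategy concrete, here is a minimal sketch of a compression-based distance between two sequences. It is not the estimator from the talk: it swaps in the widely used normalized compression distance, with zlib's compressed length standing in for algorithmic complexity. The helper names (compressed_size, ncd, random_dna, mutate) and the toy random sequences are assumptions made purely for illustration.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Bytes needed by zlib at maximum level: a crude proxy for algorithmic complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance (assumed stand-in, not the talk's exact measure).

    Small when compressing x and y together costs little more than
    compressing the larger of the two alone, i.e. when they share information.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(n: int, rng: random.Random) -> bytes:
    """Hypothetical toy data: n bases drawn uniformly from A, C, G, T."""
    return bytes(rng.choice(b"ACGT") for _ in range(n))


def mutate(seq: bytes, rate: float, rng: random.Random) -> bytes:
    """Hypothetical toy model: redraw each base with probability `rate`."""
    return bytes(rng.choice(b"ACGT") if rng.random() < rate else base for base in seq)
```

Under these assumptions, a sequence and a lightly mutated copy of it should come out closer than two independently drawn sequences, for example when comparing ncd(a, mutate(a, 0.02, rng)) with ncd(a, random_dna(len(a), rng)). A general-purpose compressor such as zlib only sees repeats within a limited window, so this sketch is meaningful for short sequences only.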