Bayesian learning of stochastic interactions
- Pierre-Yves Bourguignon (MPI MiS Leipzig)
Abstract
Because available datasets are limited in size, proper inference of interactions between sites in related biological sequences (protein families, regulatory sites, etc.) is a key aspect of various recurrent problems in bioinformatics, such as gene prediction, modelling of transcription factor binding sites, and protein classification.
In 2001, S.I. Amari introduced a very general framework for hierarchically decomposing stochastic interactions between sites when the joint distribution is given. He later published a statistical extension of this framework, which makes it possible to assess whether the data reflect significant interactions, using a frequentist approach in which interactions of increasing order are tested iteratively.
On the other hand, Bayesian statistics has proven effective at such model selection tasks, both theoretically and practically, for instance in text compression, where the structure of a context tree has to be inferred. My project is to combine the merits of Amari's approach with Bayesian model selection procedures.
After an introduction to the biological motivation of this work, I will present Bayesian model selection procedures and comment on their rationale. The lecture will conclude with future lines of work aimed at applying this theory to Amari's model.