We have decided to discontinue the publication of preprints on our preprint server as of 1 March 2024. The publication culture within mathematics has changed so much due to the rise of repositories such as ArXiV (www.arxiv.org) that we are encouraging all institute members to make their preprints available there. An institute's repository in its previous form is, therefore, unnecessary. The preprints published to date will remain available here, but we will not add any new preprints here.
MiS Preprint
39/2010
Mixture Decomposition of Distributions using a Decomposition of the Sample Space
Guido Montúfar
Abstract
We consider the set of join probability distributions of $N$ binary random variables which can be written as a sum of $m$ distributions in the following form $p(x_1,\ldots,x_N)=\sum_{i=1}^m \alpha_i f_i(x_1,\ldots,x_N)$, where $\alpha_i \geq 0$, $\sum_{i=1}^m \alpha_i =1$, and the $f_i(x_1,\ldots,x_N)$ belong to some exponential family.
For our analysis we decompose the sample space into portions on which the mixture components $f_i$ can be chosen arbitrarily. We derive lower bounds on the number of mixture components from a given exponential family necessary to represent distributions with arbitrary correlations up to a certain order or to represent any distribution.
For instance, in the case where $f_i$ are independent distributions we show that every distribution $p$ on $\{0,1\}^N$ is contained in the mixture model whenever $m\geq 2^{N-1}$, and furthermore, that there are distributions which are not contained in the mixture model whenever $m<2^{N-1}$.