Workshop

Large deviations for optimal local alignment scores of independent sequences

  • Steffen Grossmann (University of Frankfurt)
  • Benjamin Yakir
Großer Saal Alte Handelsbörse Leipzig (Leipzig)

Abstract

The distribution function of the optimal local gapped alignment score $M_n$ of two long unrelated DNA sequences of length $n$ is widely believed to be of the Gumbel type $F(x)=\exp (-n^2 K \exp (- \lambda x ))$. Whereas this is well known for gapless alignment (Dembo et al., 1994), a rigorous justification of this belief in the gapped case has been given only in special cases (e.g. Siegmund/Yakir, 2000). The two parameters $K$ and $\lambda$ of course depend on the scoring scheme, but numerically usable representations are available in these cases.

We look at $\lim\limits_{t \to \infty} -\frac{1}{t} \log P (M_n \geq t)$ in the large deviations regime, that is, where $n = o(e^t)$. We give a connection between this large deviations rate and the growth constant of the expectation of $M_n$ in the logarithmic regime (Arratia/Waterman, 1994). We also provide a new representation for this limit via a characteristic equation for moment-generating functions of optimal global alignment scores. This representation is similar to the one known from gapless alignment. Assuming the correctness of the Gumbel form, this gives a new representation for the parameter $\lambda$.
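
For orientation, the gapless characteristic equation that the abstract alludes to can be sketched numerically. In the gapless case, $\lambda$ is the unique positive solution of $E[\exp(\lambda s(X,Y))] = 1$, where $s$ is the score of one aligned letter pair and the expected score is negative. The sketch below is not the authors' method for the gapped case. The scoring scheme (match +1, mismatch -1), the uniform letter distribution (match probability 1/4), the function names, and the value of $K$ used in the usage note are all illustrative assumptions.

```python
import math


def gapless_lambda(match=1.0, mismatch=-1.0, p_match=0.25, tol=1e-12):
    """Positive root of E[exp(lam * s)] = 1 for a two-valued letter-pair score.

    Assumed toy model: score `match` with probability p_match, else `mismatch`.
    """
    if p_match * match + (1.0 - p_match) * mismatch >= 0.0:
        raise ValueError("expected score must be negative")

    def phi(lam):
        # Moment-generating function of the pair score, minus 1.
        return (p_match * math.exp(lam * match)
                + (1.0 - p_match) * math.exp(lam * mismatch) - 1.0)

    # phi(0) = 0 and phi'(0) < 0, so phi is negative just right of 0.
    lo, hi = 1e-9, 1.0
    while phi(hi) <= 0.0:  # grow the bracket until phi turns positive
        hi *= 2.0
    while hi - lo > tol:   # plain bisection on the sign change
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def gumbel_tail(x, n, K, lam):
    """P(M_n >= x) under the conjectured form F(x) = exp(-n^2 K exp(-lam x))."""
    return -math.expm1(-n * n * K * math.exp(-lam * x))
```

With the default toy scoring, the equation reduces to $\tfrac14 e^{\lambda} + \tfrac34 e^{-\lambda} = 1$, whose positive root is $\lambda = \log 3$, so `gapless_lambda()` should return approximately 1.0986. A call such as `gumbel_tail(20, 1000, 0.1, gapless_lambda())` then evaluates the Gumbel tail for a hypothetical $K = 0.1$.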

conference
17.09.01 - 20.09.01

Genes, Cells, Populations - Mathematics and Biology

Alte Handelsbörse Leipzig Großer Saal

A. Greven

A. v.Haeseler

Angela Stevens

A. Wakolbinger