Large deviations for optimal local alignment scores of independent sequences

  • Steffen Grossmann (University of Frankfurt)
  • Benjamin Yakir
Großer Saal Alte Handelsbörse Leipzig (Leipzig)


The distribution function of the optimal local gapped alignment score of two long unrelated DNA sequences of length $n$ is widely believed to be of the Gumbel type $\exp(-K n^2 e^{-\lambda t})$. Whereas this is well known for gapless alignment (Dembo et al., 1994), a rigorous justification of this belief in the gapped case has only been given in special cases (e.g. Siegmund/Yakir, 2000). The two parameters $K$ and $\lambda$ of course depend on the scoring scheme, but numerically usable representations are available in these cases.
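
To make the quantity under study concrete: the optimal local gapped alignment score can be computed by dynamic programming (the Smith-Waterman recursion). The following is a minimal sketch, assuming a linear gap penalty and illustrative score values (match +1, mismatch -1, gap -2) that are not taken from the talk. The function names `local_alignment_score` and `random_dna` are hypothetical.

```python
import random


def local_alignment_score(x, y, match=1, mismatch=-1, gap=-2):
    """Optimal local alignment score of x and y (Smith-Waterman).

    Assumes a linear gap penalty; the default score values are
    illustrative assumptions, not the scoring scheme of the talk.
    """
    prev = [0] * (len(y) + 1)  # DP row for the previous letter of x
    best = 0
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            s = match if a == b else mismatch
            cell = max(
                0,                 # start a new local alignment
                prev[j - 1] + s,   # align a with b
                prev[j] + gap,     # a is aligned to a gap
                curr[j - 1] + gap, # b is aligned to a gap
            )
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def random_dna(n, rng):
    """Independent, uniformly distributed letters (an assumed null model)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(0)
    # Scores of a few pairs of unrelated sequences of length 100.
    scores = [
        local_alignment_score(random_dna(100, rng), random_dna(100, rng))
        for _ in range(20)
    ]
    print(sorted(scores))
```

Repeating the simulation for many independent pairs gives an empirical distribution of the score that can be compared with a fitted Gumbel law. Plain simulation of this kind only reaches moderate thresholds, because events far out in the tail are too rare to observe.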


We look at the exponential decay rate of the upper tail of this distribution in the large deviations regime, that is, where […]. We give a connection between this large deviations rate and the growth constant of the expectation of the optimal score in the logarithmic regime (Arratia/Waterman, 1994). We also provide a new representation for this limit via a characteristic equation for moment-generating functions of optimal global alignment scores. This representation is similar to the one known from gapless alignment.

Assuming the correctness of the Gumbel form, this gives a new representation for the parameter $\lambda$.
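
For comparison, the representation known from gapless alignment characterizes the exponential parameter as the unique positive root of the equation E[exp(λ s(X, Y))] = 1, where s(X, Y) is the score of a random letter pair. The sketch below solves that gapless equation by bisection. It does not reproduce the new representation via global alignment scores described in the abstract. The function name `gapless_lambda`, the uniform letter distribution and the +1/-1 scores are assumptions chosen for illustration.

```python
import math


def gapless_lambda(probs, score, tol=1e-12):
    """Positive root of sum_{a,b} p_a p_b exp(lam * score(a, b)) = 1.

    Requires a negative expected pair score and at least one positive
    score, which guarantees a unique positive root.
    """
    pairs = [
        (pa * pb, score(a, b))
        for a, pa in probs.items()
        for b, pb in probs.items()
    ]
    if sum(w * s for w, s in pairs) >= 0 or max(s for _, s in pairs) <= 0:
        raise ValueError("need negative mean score and some positive score")

    def f(lam):
        return sum(w * math.exp(lam * s) for w, s in pairs) - 1.0

    # f is convex with f(0) = 0 and f'(0) < 0, so f < 0 on (0, root)
    # and f > 0 beyond the root. Double hi until it passes the root.
    lo, hi = 0.0, 1.0
    while f(hi) <= 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    uniform = {c: 0.25 for c in "ACGT"}
    lam = gapless_lambda(uniform, lambda a, b: 1 if a == b else -1)
    print(lam)  # the equation 0.25*e^lam + 0.75*e^-lam = 1 gives lam = ln 3
```

For this example the equation reduces to a quadratic in exp(λ) with roots 1 and 3, so the result can be checked against ln 3 ≈ 1.0986.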

17.09.01 - 20.09.01

Genes, Cells, Populations - Mathematics and Biology

Alte Handelsbörse Leipzig Großer Saal

A. Greven

A. v. Haeseler

Angela Stevens

A. Wakolbinger