Identification of sequences from partial sequence information
Abstract
Subwords in reverse complement order
Given two natural number n,i, let F(n,i) denote the set of all strictly monotonous maps from {1,2,...,i} into {1,2,...,n}.
Further, given an alphabet A:={a,a',b,b'} and a sequence s = s(1) s(2) ... s(n) of length n over A, let S' denote the sequence s':= s(n)' s(n-1)' ... s(1)' with x':= a' for x=a, x':= a for x=a', x':= b' for x=b, and x':= b for x=b'.
Finally, with n,i, A, and s as above, let Subword(s|i) denote the set Subword(s|i) := {s(f(1)) ... s(f(i)) : f in F(n,i)}.
Then, one has s=t or s=t' for any two sequences s,t of length n with entries from A if and only of the union of the two sets Subword(s|[2n/3]) and Subword(s'|[2n/3]) coincides with the union of the two sets
Subword(t|[2n/3]) and Subword(t'|[2n/3]).