01.04.2015 Views

Sequence Comparison.pdf


6 Anatomy of Spaced Seeds

6.2.3 Two Inequalities

In this section, we establish two inequalities on the hit probability. As we shall see in the following sections, these two inequalities are very useful for comparing spaced seeds in the asymptotic limit.

Theorem 6.2. Let $\pi$ be a spaced seed and $n > |\pi|$. Then, for any $2|\pi|-1 \le k \le n$,

(i) $\pi_k \bar{\Pi}_{n-k+|\pi|-1} \le \pi_n \le \pi_k \bar{\Pi}_{n-k}$.

(ii) $\bar{\Pi}_k \bar{\Pi}_{n-k+|\pi|-1} \le \bar{\Pi}_n < \bar{\Pi}_k \bar{\Pi}_{n-k}$.
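The two chains of inequalities can be sanity-checked by brute force on short sequences. The sketch below is illustrative only and rests on assumptions not stated in this section: $R$ is taken to be a 0/1 sequence whose positions are independently 1 with probability `p`, a seed is written as a string of `1` (must match) and `*` (don't care), $\bar{\Pi}_n$ is taken to be the probability that the seed does not hit a length-$n$ sequence, and $\pi_n$ is computed as the probability that the seed hits the first window and no later one, which is the form the proof gives in (6.7). The function names `hits`, `probabilities` and `check_theorem` are hypothetical. Exact rational arithmetic is used, and all comparisons are non-strict, so the strictness of the last inequality in (ii) is not tested.

```python
from fractions import Fraction
from itertools import product


def hits(seq, seed, end):
    """True if the seed matches the window of seq ending at 1-based position `end`."""
    start = end - len(seed)  # 0-based index of the first symbol of the window
    return all(seq[start + j] == 1 for j, c in enumerate(seed) if c == "1")


def probabilities(seed, n, p):
    """Return (pi_n, nonhit_n) by enumerating all 0/1 sequences of length n.

    pi_n     : probability of a hit in the first window and in no later window.
    nonhit_n : probability that no window is hit (assumed meaning of \\bar{Pi}_n).
    """
    m = len(seed)
    pi_n = Fraction(0)
    nonhit_n = Fraction(0)
    for seq in product((0, 1), repeat=n):
        ones = sum(seq)
        weight = p ** ones * (1 - p) ** (n - ones)
        first_hit = n >= m and hits(seq, seed, m)
        later_hit = any(hits(seq, seed, i) for i in range(m + 1, n + 1))
        if not first_hit and not later_hit:
            nonhit_n += weight
        if first_hit and not later_hit:
            pi_n += weight
    return pi_n, nonhit_n


def check_theorem(seed, n_max, p):
    """Check both chains (non-strictly) for all |seed| < n <= n_max and 2|seed|-1 <= k <= n."""
    m = len(seed)
    table = [probabilities(seed, n, p) for n in range(n_max + 1)]
    pi = [t[0] for t in table]
    nonhit = [t[1] for t in table]
    for n in range(m + 1, n_max + 1):
        for k in range(2 * m - 1, n + 1):
            lower_tail = nonhit[n - k + m - 1]  # factor in the lower bounds
            upper_tail = nonhit[n - k]          # factor in the upper bounds
            if not (pi[k] * lower_tail <= pi[n] <= pi[k] * upper_tail):
                return False
            if not (nonhit[k] * lower_tail <= nonhit[n] <= nonhit[k] * upper_tail):
                return False
    return True


# Example run with an arbitrary seed and match probability (both are assumptions).
print(check_theorem("11*1", 10, Fraction(7, 10)))
```

Enumeration costs $2^n$ sequences per length, so this is only practical for small `n_max`; it is a check on small cases, not a substitute for the proof.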

Proof. (i) Recall that $A_i$ denotes the event that seed $\pi$ hits the random sequence $R$ at position $i-1$, and $\bar{A}_i$ the complement of $A_i$. Set $\bar{A}_{i,j} = \bar{A}_i \bar{A}_{i+1} \cdots \bar{A}_j$. By symmetry,

\[
\pi_n = \Pr\left[ A_{|\pi|} \bar{A}_{|\pi|+1,n} \right]. \tag{6.7}
\]

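The second inequality in (i) follows directly from the fact that the event $A_{|\pi|} \bar{A}_{|\pi|+1,n}$ is a subevent of $A_{|\pi|} \bar{A}_{|\pi|+1,k} \bar{A}_{k+|\pi|,n}$ for any $|\pi|+1 < k < n$: the factors $A_{|\pi|} \bar{A}_{|\pi|+1,k}$ and $\bar{A}_{k+|\pi|,n}$ depend on disjoint positions of $R$, so they are independent, with probabilities $\pi_k$ and $\bar{\Pi}_{n-k}$ respectively. The first inequality in (i) is proved as follows.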

Let $k$ be an integer in the range from $2|\pi|-1$ to $n$. For any $1 \le i \le |\pi|-1$, let $S_i$ be the set of all length-$i$ binary strings. For any $w \in S_i$, we use $E_w$ to denote the event that $R[k-|\pi|+2, k-|\pi|+i+1] = w$ in the random sequence $R$. With $\varepsilon$ being the empty string, $E_\varepsilon$ is the whole sample space. Obviously, $E_\varepsilon = E_0 \cup E_1$. In general, for any string $w$ of length less than $|\pi|-1$, $E_w = E_{w0} \cup E_{w1}$, and $E_{w0}$ and $E_{w1}$ are disjoint. By conditioning the event $A_{|\pi|} \bar{A}_{|\pi|+1,n}$ in formula (6.7) on the events $E_w$, we have

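\[
\begin{aligned}
\pi_n &= \sum_{w \in S_{|\pi|-1}} \Pr[E_w] \Pr\left[ A_{|\pi|} \bar{A}_{|\pi|+1,k} \bar{A}_{k+1,n} \mid E_w \right] \\
&= \sum_{w \in S_{|\pi|-1}} \Pr[E_w] \Pr\left[ A_{|\pi|} \bar{A}_{|\pi|+1,k} \mid E_w \right] \Pr\left[ \bar{A}_{k+1,n} \mid E_w \right],
\end{aligned}
\]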

where the last equality follows from two facts: (a) conditioned on $E_w$, with $w \in S_{|\pi|-1}$, the event $A_{|\pi|} \bar{A}_{|\pi|+1,k}$ is independent of the positions beyond position $k$, and (b) $\bar{A}_{k+1,n}$ is independent of the first $k-|\pi|+1$ positions. Note that

\[
\pi_k = \Pr\left[ A_{|\pi|} \bar{A}_{|\pi|+1,k} \right] = \Pr\left[ A_{|\pi|} \bar{A}_{|\pi|+1,k} \mid E_\varepsilon \right]
\]

and

\[
\bar{\Pi}_{n-k+|\pi|-1} = \Pr\left[ \bar{A}_{k+1,n} \mid E_\varepsilon \right].
\]

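Thus, we only need to prove that

\[
\begin{aligned}
& \sum_{w \in S_j} \Pr[E_w] \Pr\left[ A_{|\pi|} \bar{A}_{|\pi|+1,k} \mid E_w \right] \Pr\left[ \bar{A}_{k+1,n} \mid E_w \right] \\
\ge{} & \sum_{w \in S_{j-1}} \Pr[E_w] \Pr\left[ A_{|\pi|} \bar{A}_{|\pi|+1,k} \mid E_w \right] \Pr\left[ \bar{A}_{k+1,n} \mid E_w \right]
\end{aligned}
\tag{6.8}
\]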
