01.04.2015 Views

Sequence Comparison.pdf


6 Anatomy of Spaced Seeds

$$RP(\pi) = \{\, i_1 = 0,\ i_2,\ \ldots,\ i_{w_\pi} = |\pi| - 1 \,\}. \tag{6.1}$$

The seed $\pi$ is said to hit $R$ at position $j$ if and only if $R[j - |\pi| + i_k + 1] = 1$ for all $1 \le k \le w_\pi$. Here, we use the ending position as the hit position, following the convention in renewal theory.
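The hit condition can be checked directly from the definition. The following is a minimal sketch, assuming a seed is written as a Python string of `1` (match) and `*` (don't care) characters and $R$ is a list of 0/1 values; the function names `relative_positions` and `hits_at` are illustrative and not taken from the text.

```python
def relative_positions(seed: str) -> list[int]:
    """Return RP(seed): the 0-based offsets of the '1' positions in the seed."""
    return [i for i, c in enumerate(seed) if c == "1"]


def hits_at(seed: str, r: list[int], j: int) -> bool:
    """Return True if the seed hits the 0/1 sequence r at ending position j.

    The seed occupies r[j - |seed| + 1 .. j]; it hits if every position
    j - |seed| + i_k + 1, with i_k in RP(seed), holds a 1.
    """
    start = j - len(seed) + 1
    if start < 0 or j >= len(r):
        return False  # the seed does not fit in r when ending at j
    return all(r[start + i] == 1 for i in relative_positions(seed))
```

For instance, with the seed `1*1` and `r = [1, 0, 1, 1]`, the seed hits at ending position 2 (positions 0 and 2 are both 1) but not at position 3 (position 1 is 0).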

Let $A_i$ denote the event that $\pi$ hits $R$ at position $i$, and $\bar{A}_i$ the complement of $A_i$. We use $\pi_i$ to denote the probability that $\pi$ first hits $R$ at position $i - 1$, that is,

$$\pi_i = \Pr[\bar{A}_0 \bar{A}_1 \cdots \bar{A}_{i-2} A_{i-1}].$$

We call $\pi_i$ the first hit probability. Let $\Pi_n := \Pi_n(p)$ denote the probability that $\pi$ hits $R[0, n-1]$, and let $\bar{\Pi}_n := 1 - \Pi_n$. We call $\Pi_n$ the hit probability and $\bar{\Pi}_n$ the non-hit probability of $\pi$.

For each $0 \le n < |\pi| - 1$, trivially, $A_n = \emptyset$ and $\Pi_n = 0$. Because the event $\bigcup_{0 \le i \le n-1} A_i$ is the disjoint union of the events

$$\bigcup_{0 \le i \le n-2} A_i$$

and

$$\overline{\bigcup_{0 \le i \le n-2} A_i}\; A_{n-1} = \bar{A}_0 \bar{A}_1 \cdots \bar{A}_{n-2} A_{n-1},$$

we have

$$\Pi_n = \pi_1 + \pi_2 + \cdots + \pi_n,$$

or equivalently,

$$\bar{\Pi}_n = \pi_{n+1} + \pi_{n+2} + \cdots \tag{6.2}$$

for $n \ge |\pi|$.
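The identity $\Pi_n = \pi_1 + \pi_2 + \cdots + \pi_n$ can be checked numerically for small $n$ by exhaustive enumeration. The sketch below assumes that each position of $R$ is 1 independently with probability $p$ (which is how $p$ appears to be used in $\Pi_n(p)$), and that a seed is a string of `1` and `*` characters; the function names are illustrative only.

```python
from itertools import product


def first_hit_position(seed, r):
    """Return the smallest ending position at which seed hits r, or None."""
    rp = [i for i, c in enumerate(seed) if c == "1"]
    for j in range(len(seed) - 1, len(r)):
        start = j - len(seed) + 1
        if all(r[start + i] == 1 for i in rp):
            return j
    return None


def sequence_prob(r, p):
    """Probability of the 0/1 tuple r when each entry is 1 with probability p."""
    ones = sum(r)
    return p**ones * (1 - p) ** (len(r) - ones)


def first_hit_probs(seed, n, p):
    """Return [pi_1, ..., pi_n]; pi_i is the probability of a first hit at i - 1."""
    probs = [0.0] * n
    for r in product((0, 1), repeat=n):
        j = first_hit_position(seed, r)
        if j is not None:
            probs[j] += sequence_prob(r, p)  # index j stores pi_{j+1}
    return probs


def hit_prob(seed, n, p):
    """Return Pi_n, the probability that seed hits R[0, n-1] somewhere."""
    return sum(
        sequence_prob(r, p)
        for r in product((0, 1), repeat=n)
        if first_hit_position(seed, r) is not None
    )
```

Enumeration costs $2^n$ steps, so this is only a sanity check for short sequences, not a practical way to compute hit probabilities.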

Example 6.1. Let $\theta$ be the consecutive seed of weight $w$. If $\theta$ hits the random sequence $R$ at position $n - 1$, but not at position $n - 2$, then $R[n - w, n - 1] = 11\cdots1$ and $R[n - w - 1] = 0$. As a result, $\theta$ cannot hit $R$ at positions $n - 3, n - 4, \ldots, n - w - 1$. This implies

$$\theta_n = \Pr[\bar{A}_0 \bar{A}_1 \cdots \bar{A}_{n-2} A_{n-1}] = p^w (1 - p)\, \bar{\Theta}_{n-w-1}, \quad n \ge w + 1.$$

By formula (6.2), its non-hit probability satisfies the following recurrence relation:

$$\bar{\Theta}_n = \bar{\Theta}_{n-1} - p^w (1 - p)\, \bar{\Theta}_{n-w-1}. \tag{6.3}$$
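Recurrence (6.3) gives a linear-time way to tabulate the non-hit probability of a consecutive seed. The sketch below is one possible implementation; the initial values $\bar{\Theta}_n = 1$ for $n < w$ and $\bar{\Theta}_w = 1 - p^w$ are not stated in the text and are assumed here from the definitions (a sequence shorter than $w$ cannot be hit, and a sequence of length $w$ is hit only if it is all 1s). The brute-force function is included only as a cross-check, and both function names are illustrative.

```python
from itertools import product


def nonhit_consecutive(w, n, p):
    """Return [Theta_bar_0, ..., Theta_bar_n] for the consecutive seed of weight w."""
    bar = []
    for m in range(n + 1):
        if m < w:
            bar.append(1.0)  # assumed base case: sequence too short to be hit
        elif m == w:
            bar.append(1.0 - p**w)  # assumed base case: hit only by all 1s
        else:
            # Recurrence (6.3), valid for m >= w + 1
            bar.append(bar[m - 1] - p**w * (1 - p) * bar[m - w - 1])
    return bar


def nonhit_brute_force(w, n, p):
    """Sum the probabilities of all length-n sequences with no run of w 1s."""
    total = 0.0
    for r in product((0, 1), repeat=n):
        run = longest = 0
        for x in r:
            run = run + 1 if x else 0
            longest = max(longest, run)
        if longest < w:
            ones = sum(r)
            total += p**ones * (1 - p) ** (n - ones)
    return total
```

For small cases such as `w = 3`, `p = 0.7` and `n` up to about 10, the two functions should agree up to floating-point rounding, which gives some confidence in the assumed base cases.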

Example 6.2. Let $\pi$ be a spaced seed and $k > 1$. By inserting $(k - 1)$ $*$'s between every two consecutive positions in $\pi$, we obtain a spaced seed $\pi'$ of the same weight and length $|\pi'| = k|\pi| - k + 1$. It is not hard to see that $\pi'$ hits the random sequence $R[0, n - 1] = s[0]s[1]\cdots s[n - 1]$ if and only if $\pi$ hits one of the following $k$ random sequences:

$$s[i]\, s[k + i]\, s[2k + i] \cdots s[lk + i],$$

$i = 0, 1, \ldots, r,$
