01.04.2015 Views

Sequence Comparison.pdf


7.3 Gapped Local Alignment Scores

$$A(s) = \frac{1}{|I_s|} \sum_{i \in I_s} \bigl(S(i) - s\bigr), \tag{7.51}$$

where $S(i)$ is the score of island $i$ and $|I_s|$ is the number of islands in $I_s$. Because the island scores are integral and have no proper common divisors in the cases of interest, the maximum-likelihood estimate for $\lambda$ is

$$\lambda_s = \ln\left(1 + \frac{1}{A(s)}\right). \tag{7.52}$$

By equation (7.45), $K$ is calculated as

$$K_s = \frac{1}{V} \times |I_s| \times e^{s\lambda_s}, \tag{7.53}$$

where $V$ is the size of the search space from which the island scores are collected. $V$ is equal to $n^2$ if the islands are obtained from locally aligning two sequences of length $n$; it is $Nn^2$ if $N$ such alignments were performed.
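The estimates in equations (7.51) to (7.53) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `estimate_lambda_k` and its parameter names are hypothetical, and the sketch assumes that $I_s$ is the set of islands whose score is at least $s$ (the definition of $I_s$ lies outside this excerpt).

```python
import math


def estimate_lambda_k(island_scores, s, search_space):
    """Estimate (lambda_s, K_s) from island scores with cutoff s.

    island_scores: iterable of integer island scores S(i)
    s:             integer score cutoff
    search_space:  V, e.g. n * n for one local alignment of two
                   length-n sequences, or N * n * n for N of them
    """
    # Assumption: I_s consists of the islands scoring at least s.
    kept = [x for x in island_scores if x >= s]
    if not kept:
        raise ValueError("no island reaches the cutoff s")

    # A(s): mean excess of the kept scores over s, eq. (7.51).
    mean_excess = sum(x - s for x in kept) / len(kept)
    if mean_excess == 0:
        raise ValueError("every kept island scores exactly s; lambda is undefined")

    lam = math.log(1.0 + 1.0 / mean_excess)           # eq. (7.52)
    k = len(kept) * math.exp(s * lam) / search_space  # eq. (7.53)
    return lam, k


# Made-up scores from one alignment of two sequences of length 100.
lam, k = estimate_lambda_k([10, 12, 15, 8, 11], s=10, search_space=100 * 100)
print(lam, k)
```

In this made-up example four islands reach the cutoff, their mean excess over $s = 10$ is 2, and so $\lambda_s = \ln(1.5)$. The guard against a zero mean excess is a design choice of the sketch: equation (7.52) has no finite value when $A(s) = 0$.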

Maximum-Likelihood Method for Estimating λ (Altschul et al., 2001, [5])<br />

The island scores $S$ asymptotically follow a geometric-like distribution

$$\Pr[S = x] = D e^{-\lambda x},$$

where $D$ is a constant. For a large integer cutoff $c$,

$$\Pr[S = x \mid S \ge c] \approx \frac{D e^{-\lambda x}}{\sum_{j=c}^{\infty} D e^{-\lambda j}} = (1 - e^{-\lambda})\, e^{-\lambda (x - c)}.$$
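The final equality above rests on summing a geometric series in the denominator. Written out with the same symbols (the summation index $k$ is introduced here only for the substitution $j = c + k$):

```latex
\begin{align*}
\sum_{j=c}^{\infty} D e^{-\lambda j}
  &= D e^{-\lambda c} \sum_{k=0}^{\infty} e^{-\lambda k}
   = \frac{D e^{-\lambda c}}{1 - e^{-\lambda}}, \\
\frac{D e^{-\lambda x}}{\sum_{j=c}^{\infty} D e^{-\lambda j}}
  &= D e^{-\lambda x} \cdot \frac{1 - e^{-\lambda}}{D e^{-\lambda c}}
   = (1 - e^{-\lambda})\, e^{-\lambda (x - c)}.
\end{align*}
```

The constant $D$ cancels, so the conditional distribution depends only on $\lambda$ and on the excess $x - c$.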

Let $x_i$ denote the $i$th island score for $i = 1, 2, \ldots, M$. Then the logarithm of the probability of observing the scores $x_1, x_2, \ldots, x_M$, given that each of them has a value of $c$ or greater, is

$$\ln\bigl(\Pr[x_1, x_2, \ldots, x_M \mid x_1 \ge c, x_2 \ge c, \ldots, x_M \ge c]\bigr) = -\lambda \sum_{j=1}^{M} (x_j - c) + M \ln(1 - e^{-\lambda}).$$

The best value $\lambda_{ML}$ of $\lambda$ is the one that maximizes this expression. By equating the first derivative of this expression to zero, we obtain

$$\lambda_{ML} = \ln\left(1 + \frac{1}{\frac{1}{M} \sum_{j=1}^{M} (x_j - c)}\right).$$
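The differentiation step can be spelled out as follows. The shorthand $\bar{A}$ for the mean excess of the scores over the cutoff is introduced here only for readability and is not notation from the text.

```latex
\begin{align*}
\bar{A} &= \frac{1}{M} \sum_{j=1}^{M} (x_j - c), \\
\frac{d}{d\lambda}\Bigl[-\lambda M \bar{A} + M \ln(1 - e^{-\lambda})\Bigr]
  &= -M \bar{A} + \frac{M e^{-\lambda}}{1 - e^{-\lambda}} = 0, \\
\frac{e^{-\lambda}}{1 - e^{-\lambda}} = \frac{1}{e^{\lambda} - 1} &= \bar{A}
  \quad\Longrightarrow\quad e^{\lambda} = 1 + \frac{1}{\bar{A}}, \\
\lambda_{ML} &= \ln\left(1 + \frac{1}{\bar{A}}\right).
\end{align*}
```

The second derivative, $-M e^{-\lambda} / (1 - e^{-\lambda})^2$, is negative for every $\lambda > 0$, so this stationary point is a maximum. With $c = s$ and $\bar{A}$ playing the role of $A(s)$, the result has the same form as equation (7.52).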
