01.04.2015 Views

Sequence Comparison.pdf

Sequence Comparison.pdf

Sequence Comparison.pdf

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

114 6 Anatomy of Spaced Seeds<br />

Table 6.5 The hit probability of the best spaced seeds on a region of length 64 for each weight and<br />

a similarity level. This is reproduced from the paper [48] of Choi, Zeng, and Zhang; reprinted with<br />

permission of Oxford University Press.<br />

W<br />

Best spaced seeds<br />

Similarity<br />

(%)<br />

Sensitivity<br />

9 11*11*1*1***111 65<br />

70<br />

80<br />

90<br />

10 11*11***11*1*111 65<br />

70<br />

80<br />

90<br />

11 111*1**1*1**11*111<br />

65<br />

70<br />

111**1*11**1*1*111<br />

80<br />

90<br />

12 111*1*11*1**11*111 65<br />

70<br />

80<br />

90<br />

13 111*11*11**1*1*1111<br />

111*1*11**11**1*1111<br />

111*1**11*1**111*111<br />

14 1111*1*1*11**11*1111<br />

111*111**1*11**1*1111<br />

65<br />

70<br />

80<br />

90<br />

65<br />

70<br />

80<br />

90<br />

0.52215<br />

0.72916<br />

0.97249<br />

0.99991<br />

0.38093<br />

0.59574<br />

0.93685<br />

0.99957<br />

0.26721<br />

0.46712<br />

0.88240<br />

0.99848<br />

0.18385<br />

0.35643<br />

0.81206<br />

0.99583<br />

0.12327<br />

0.26475<br />

0.73071<br />

0.99063<br />

0.08179<br />

0.19351<br />

0.66455<br />

0.98168<br />

Empirical studies show that transition seeds have a good trade-off between sensitivity<br />

and specificity for homology search in both coding and non-coding regions.<br />

6.6.2 Multiple Spaced Seeds<br />

The idea of spaced seeds is equivalent to using multiple similar word patterns to<br />

increase the sensitivity. Naturally, multiple spaced seeds are employed to optimize<br />

the sensitivity. In this case, a set of spaced seeds are selected first; then all the hits<br />

generated by these seeds are examined to produce local alignments.<br />

The empirical studies by several research groups show that doubling the number<br />

of seeds gets better sensitivity than reducing the single seed by one bit. Moreover,<br />

for DNA homology search, the former roughly doubles the number of hits, whereas

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!