01.04.2015 Views

Sequence Comparison.pdf


30 2 Basic Algorithmic Techniques

〈P,R,E,S,I,D,E,N,T〉 and 〈P,R,O,V,I,D,E,N,C,E〉, 〈P,R,D,N〉 is a common subsequence of them, whereas 〈P,R,V〉 is not. Their LCS is 〈P,R,I,D,E,N〉.
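The claims in this example can be checked mechanically. The following is a minimal Python sketch, assuming sequences are represented as strings; the helper names `is_subsequence` and `is_common_subsequence` are hypothetical and do not come from the text.

```python
def is_subsequence(candidate, sequence):
    """Return True if `candidate` can be obtained from `sequence`
    by deleting zero or more elements without reordering the rest."""
    remaining = iter(sequence)
    # `symbol in remaining` consumes the iterator up to the first match,
    # so later symbols can only match later positions of `sequence`.
    return all(symbol in remaining for symbol in candidate)


def is_common_subsequence(candidate, first, second):
    """Return True if `candidate` is a subsequence of both sequences."""
    return is_subsequence(candidate, first) and is_subsequence(candidate, second)


print(is_common_subsequence("PRDN", "PRESIDENT", "PROVIDENCE"))    # True
print(is_common_subsequence("PRV", "PRESIDENT", "PROVIDENCE"))     # False
print(is_common_subsequence("PRIDEN", "PRESIDENT", "PROVIDENCE"))  # True
```

The check only confirms that a candidate is common to both sequences; it does not by itself show that the candidate is a longest one.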

Now let us formulate the recurrence for computing the length of an LCS of two sequences. We are given two sequences $A = \langle a_1, a_2, \ldots, a_m \rangle$ and $B = \langle b_1, b_2, \ldots, b_n \rangle$. Let $\mathit{len}[i, j]$ denote the length of an LCS between $\langle a_1, a_2, \ldots, a_i \rangle$ (a prefix of $A$) and $\langle b_1, b_2, \ldots, b_j \rangle$ (a prefix of $B$). These values can be computed by the following recurrence:

$$
\mathit{len}[i, j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0, \\
\mathit{len}[i-1, j-1] + 1 & \text{if } i, j > 0 \text{ and } a_i = b_j, \\
\max\{\mathit{len}[i, j-1],\ \mathit{len}[i-1, j]\} & \text{otherwise.}
\end{cases}
$$

In other words, if one of the sequences is empty, the length of their LCS is zero. If $a_i$ and $b_j$ are the same, an LCS of $\langle a_1, a_2, \ldots, a_i \rangle$ and $\langle b_1, b_2, \ldots, b_j \rangle$ is the concatenation of an LCS of $\langle a_1, a_2, \ldots, a_{i-1} \rangle$ and $\langle b_1, b_2, \ldots, b_{j-1} \rangle$ with $a_i$. Therefore, $\mathit{len}[i, j] = \mathit{len}[i-1, j-1] + 1$ in this case. If $a_i$ and $b_j$ are different, an LCS of the two prefixes is either an LCS of $\langle a_1, a_2, \ldots, a_i \rangle$ and $\langle b_1, b_2, \ldots, b_{j-1} \rangle$, or an LCS of $\langle a_1, a_2, \ldots, a_{i-1} \rangle$ and $\langle b_1, b_2, \ldots, b_j \rangle$, whichever is longer. Its length is thus the maximum of $\mathit{len}[i, j-1]$ and $\mathit{len}[i-1, j]$.
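The three cases of the recurrence can be transcribed almost literally into a memoized recursive function. The following is a minimal Python sketch under stated assumptions: sequences are strings or tuples, the name `lcs_length_recursive` is hypothetical, and memoization via `functools.lru_cache` is a convenience chosen here rather than something prescribed by the text. Because the recursion depth can reach m + n, it is only suitable for short inputs.

```python
from functools import lru_cache


def lcs_length_recursive(a, b):
    """Length of an LCS of sequences a and b, following the recurrence.

    The recurrence uses 1-based indices, so a_i corresponds to a[i - 1]
    and b_j corresponds to b[j - 1].
    """

    @lru_cache(maxsize=None)
    def length(i, j):
        # Case 1: one of the prefixes is empty.
        if i == 0 or j == 0:
            return 0
        # Case 2: the last symbols match, so extend an LCS of the shorter prefixes.
        if a[i - 1] == b[j - 1]:
            return length(i - 1, j - 1) + 1
        # Case 3: drop the last symbol of one prefix and keep the better result.
        return max(length(i, j - 1), length(i - 1, j))

    return length(len(a), len(b))


print(lcs_length_recursive("PRESIDENT", "PROVIDENCE"))  # 6
```

Each pair (i, j) is evaluated at most once thanks to the cache, so the work is proportional to the number of table entries, as in an iterative table fill.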

Figure 2.11 gives the pseudo-code for computing $\mathit{len}[i, j]$. For each entry $(i, j)$, we retain the backtracking information in $\mathit{prev}[i, j]$. If $\mathit{len}[i-1, j-1]$ contributes the maximum value to $\mathit{len}[i, j]$, then we set $\mathit{prev}[i, j]$ = “↖”. Otherwise, $\mathit{prev}[i, j]$ is set to “↑” or “←”, depending on which of $\mathit{len}[i-1, j]$ and $\mathit{len}[i, j-1]$ contributes the maximum value to $\mathit{len}[i, j]$. Whenever there is a tie, any one of them will

Algorithm LCS LENGTH(A = 〈a_1, a_2, ..., a_m〉, B = 〈b_1, b_2, ..., b_n〉)
begin
    for i ← 0 to m do len[i, 0] ← 0
    for j ← 1 to n do len[0, j] ← 0
    for i ← 1 to m do
        for j ← 1 to n do
            if a_i = b_j then
                len[i, j] ← len[i − 1, j − 1] + 1
                prev[i, j] ← “↖”
            else if len[i − 1, j] ≥ len[i, j − 1] then
                len[i, j] ← len[i − 1, j]
                prev[i, j] ← “↑”
            else
                len[i, j] ← len[i, j − 1]
                prev[i, j] ← “←”
    return len and prev
end

Fig. 2.11 Computation of the length of an LCS of two sequences.
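The pseudo-code of Figure 2.11 might be rendered in Python roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the labels `"diag"`, `"up"`, and `"left"` stand in for “↖”, “↑”, and “←”, the table is named `length` to avoid shadowing Python's built-in `len`, and `trace_lcs` is a hypothetical helper (not part of the figure) showing one way the backtracking information in `prev` could be followed.

```python
def lcs_length(a, b):
    """Fill the len and prev tables for sequences a and b (cf. Figure 2.11)."""
    m, n = len(a), len(b)
    # Row 0 and column 0 stay 0: an empty prefix has an LCS of length 0.
    length = [[0] * (n + 1) for _ in range(m + 1)]
    prev = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:          # a_i = b_j
                length[i][j] = length[i - 1][j - 1] + 1
                prev[i][j] = "diag"
            elif length[i - 1][j] >= length[i][j - 1]:
                length[i][j] = length[i - 1][j]
                prev[i][j] = "up"
            else:
                length[i][j] = length[i][j - 1]
                prev[i][j] = "left"
    return length, prev


def trace_lcs(a, prev):
    """Hypothetical helper: follow prev from entry (m, n) back to a border
    and collect the matched symbols of a, returned as a list."""
    i, j = len(prev) - 1, len(prev[0]) - 1
    symbols = []
    while i > 0 and j > 0:
        if prev[i][j] == "diag":
            symbols.append(a[i - 1])
            i, j = i - 1, j - 1
        elif prev[i][j] == "up":
            i -= 1
        else:
            j -= 1
    symbols.reverse()  # symbols were collected from the end backwards
    return symbols


length, prev = lcs_length("PRESIDENT", "PROVIDENCE")
print(length[-1][-1])                       # 6
print("".join(trace_lcs("PRESIDENT", prev)))  # PRIDEN
```

The two nested loops visit each of the m × n inner entries once, so both running time and table size grow with the product of the sequence lengths.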
