New Efficient Algorithms for LCS and Constrained LCS Problem

Costas S. Iliopoulos and M. Sohel Rahman

Algorithm Design Group
Department of Computer Science, King's College London,
Strand, London WC2R 2LS, England
{sohel,csi}@dcs.kcl.ac.uk
http://www.dcs.kcl.ac.uk/adg

Abstract. In this paper, we study the classic and well-studied longest common subsequence (LCS) problem and a recent variant of it, namely the constrained LCS (CLCS) problem. In CLCS, the computed LCS must also be a supersequence of a third given string. We first present an efficient algorithm for the traditional LCS problem that runs in O(R log log n + n) time, where R is the total number of ordered pairs of positions at which the two strings match and n is the length of the two given strings. Then, using this algorithm, we devise an algorithm for the CLCS problem having time complexity O(pR log log n + n) in the worst case, where p is the length of the third string. Note that if R = o(n²), our algorithms perform very well, but if R = Θ(n²) then, due to the log log n term, they behave slightly worse than the existing algorithms.

1 Introduction

The longest common subsequence (LCS) problem is one of the classical and well-studied problems in computer science, having extensive applications in diverse areas. Perhaps the strongest motivation for the LCS problem and variants thereof, to this date, comes from computational molecular biology.
The LCS problem is a common task in DNA sequence analysis and has applications in genetics and molecular biology. Variants of the LCS problem have been used to study similarity in RNA secondary structure [6, 15, 10, 17]. Very recently, Bereg and Zhu [4] presented a new model for RNA multiple sequence structural alignment based on the longest common subsequence. Several other variants with applications in computational biology have been studied in the literature recently [18, 24, 25, 3, 2].

In this paper, we study the traditional LCS problem along with an interesting and newer variant of it, namely the Constrained LCS (CLCS) problem. In CLCS, the computed LCS must also be a supersequence of a third (given) string [25, 8, 3, 2]. This problem also finds its motivation in bioinformatics: in the computation of the homology of two biological sequences, it is important to take into account a common specific or putative structure [25].

The longest common subsequence problem for k strings (k > 2) was first shown to be NP-hard [19] and later proved to be hard to approximate [14]. The restricted, but probably more studied, problem that deals with two strings has been studied extensively [21, 26, 23, 22, 20, 13, 12, 11]. The classic dynamic programming solution to


the LCS problem, invented by Wagner and Fischer [26], has O(n²) worst-case running time, where n is the length of the two strings. Masek and Paterson [20] improved this algorithm using the "Four-Russians" technique [1] to reduce the worst-case running time to O(n²/ log n)¹. Since then, not much improvement in terms of n can be found in the literature. However, several algorithms exist with complexities depending on other parameters. For example, Myers [22] and Nakatsu et al. [23] presented O(nD) algorithms, where the parameter D is the simple Levenshtein distance between the two given strings [16]. Another interesting and perhaps more relevant parameter for this problem is R, the total number of ordered pairs of positions at which the two strings match. Hunt and Szymanski [13] presented an algorithm running in O((R + n) log n) time. They also cited applications where R ∼ n, and thereby claimed that for these applications the algorithm would run in O(n log n) time. For a comprehensive comparison of the well-known algorithms for the LCS problem and a study of their behaviour in various application environments, the reader is referred to [5].

The CLCS problem, on the other hand, was introduced quite recently by Tsai [25], where an algorithm was presented solving the problem in O(pn⁴) time, where p is the length of the third string, which applies the constraint. Later, Chin et al. [8] and, independently, Arslan and Eğecioğlu [2, 3] presented improved algorithms with O(n²p) time and space complexity. In this paper we present two new algorithms: one for the LCS and one for the CLCS problem.
In particular, we first devise an efficient algorithm for the traditional LCS problem that runs in O(R log log n + n) time. Then, using this algorithm, we devise an algorithm for the CLCS problem having time complexity O(pR log log n + n) in the worst case. Note that if R = o(n²), our algorithms perform very well, but if R = Θ(n²) then, due to the log log n term, they behave slightly worse than the existing algorithms.

The rest of the paper is organized as follows. In Section 2, we present the preliminary concepts. Section 3 is devoted to a new O(R log log n + n) time algorithm for the LCS problem. In Section 4, using the algorithm of Section 3, we devise a new efficient algorithm to solve the CLCS problem, which runs in O(pR log log n + n) time. Finally, we briefly conclude in Section 5.

2 Preliminaries

Suppose we are given two strings X[1..n] = X[1] X[2] ... X[n] and Y[1..n] = Y[1] Y[2] ... Y[n]. A subsequence S[1..r] = S[1] S[2] ... S[r] of X is obtained by deleting n − r symbols from X. A common subsequence of two strings X and Y, denoted CS(X, Y), is a subsequence common to both X and Y. The longest common subsequence of X and Y, denoted LCS(X, Y), is a common subsequence of maximum length. We denote the length of LCS(X, Y) by L(X, Y).

Problem "LCS". Given two strings X and Y, we want to find an LCS(X, Y).

¹ Employing different techniques, the same worst-case bound was achieved in [9]. In particular, for most texts, the time complexity achieved in [9] is O(hn²/ log n), where h ≤ 1 is the entropy of the text.
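For concreteness, the classic quadratic dynamic program of Wagner and Fischer [26] mentioned in the introduction can be sketched as follows. This is only the O(n²) baseline against which the new algorithms are compared, not the algorithm of Section 3:

```python
def lcs_length(X, Y):
    """Classic Wagner-Fischer DP: O(|X| * |Y|) time and space."""
    n, m = len(X), len(Y)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if X[i - 1] == Y[j - 1]:
                # The matched symbol extends an LCS of both prefixes.
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                # Drop the last symbol of one string or the other.
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[n][m]
```

For example, `lcs_length("TCCACA", "ACCAAG")` returns 4, in line with Example 1 below.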


Given two strings X[1..n] and Y[1..n] and a third string Z[1..p], a CS(X, Y) is said to be constrained by Z if, and only if, Z is a subsequence of that CS(X, Y). We use CS_Z(X, Y) to denote a common subsequence of X and Y constrained by Z. Then the longest common subsequence of X and Y constrained by Z is a CS_Z(X, Y) of maximum length and is denoted by LCS_Z(X, Y). We denote the length of LCS_Z(X, Y) by L_Z(X, Y). It is easy to note that L_Z(X, Y) ≤ L(X, Y).

Problem "CLCS". Given two strings X and Y and another string Z, we want to find an LCS_Z(X, Y).

X = T C C A C A        X = T C C A C A
Y = A C C A A G        Y = A C C A A G
                       Z = A C

Fig. 1. LCS(X, Y) and LCS_Z(X, Y) of Example 1.

Example 1. Suppose X = TCCACA, Y = ACCAAG and Z = AC. As is evident from Fig. 1, S_1 = CCAA is an LCS(X, Y). However, S_1 is not an LCS_Z(X, Y), because Z is not a subsequence of S_1. On the other hand, S_2 = ACA is an LCS_Z(X, Y). Note carefully that, in this case, L_Z(X, Y) < L(X, Y).

In this paper we use the following notions. We say a pair (i, j), 1 ≤ i, j ≤ n, defines a match if X[i] = Y[j]. The set of all matches, M, is defined as follows:

M = {(i, j) | X[i] = Y[j], 1 ≤ i, j ≤ n}.

We define |M| = R.

In what follows we assume that |X| = |Y| = n, but our results can easily be extended to the case |X| ≠ |Y|.

3 A New Algorithm for Problem LCS

In this section, we present a new efficient algorithm to solve Problem LCS, combining an interesting data structure of [7] with the techniques of [24].
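The quantities defined above can be checked on Example 1 by brute force. The sketch below enumerates all 2^|X| subsequences of X, so it is only viable for toy strings, but it makes M, R, L(X, Y) and L_Z(X, Y) concrete:

```python
from itertools import combinations

def is_subseq(s, t):
    """True iff s is a subsequence of t."""
    it = iter(t)
    return all(c in it for c in s)

X, Y, Z = "TCCACA", "ACCAAG", "AC"

# The match set M (1-based positions) and R = |M|.
M = {(i + 1, j + 1)
     for i, x in enumerate(X) for j, y in enumerate(Y) if x == y}
R = len(M)

# Every subsequence of X (2^|X| candidates -- toy sizes only),
# restricted to those that are also subsequences of Y.
subs = {"".join(X[i] for i in idx)
        for r in range(len(X) + 1)
        for idx in combinations(range(len(X)), r)}
common = {s for s in subs if is_subseq(s, Y)}

L = max(len(s) for s in common)                        # L(X, Y)
L_Z = max(len(s) for s in common if is_subseq(Z, s))   # L_Z(X, Y)
```

Here R = 12, L(X, Y) = 4 (e.g. CCAA) and L_Z(X, Y) = 3 (e.g. ACA), matching Example 1 and the inequality L_Z(X, Y) ≤ L(X, Y).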
The resulting algorithm will be applied to obtain an efficient algorithm for Problem CLCS later, in Section 4. We follow the dynamic programming formulation given in [24] and improve its running time from O(n² + R) to O(R log log n + n). The dynamic programming formulation from [24] is as follows:


T[i, j] =
  undefined                                                          if (i, j) ∉ M,
  1                                                                  if (i = 1 or j = 1) and (i, j) ∈ M,
  1 + max{T[l_i, l_j] | (l_i, l_j) ∈ M, 1 ≤ l_i < i, 1 ≤ l_j < j}    otherwise.      (1)

Fact 1. ([24]) Suppose (i, j), (i′, j) ∈ M (resp. (i, j), (i, j′) ∈ M) with i′ > i (resp. j′ > j). Then we must have T[i′][j] ≥ T[i][j] (resp. T[i][j′] ≥ T[i][j]).

Fact 2. ([24]) The calculation of a T[i][j], (i, j) ∈ M, 1 ≤ i, j ≤ n, is independent of any T[l][q], (l, q) ∈ M, with l = i, 1 ≤ q ≤ n.

Along with the above two facts, we use the 'BoundedHeap' data structure of [7], which supports the following operations:

Insert(H, pos, val, data): Insert into the BoundedHeap H the position pos with value val and associated information data.

IncreaseValue(H, pos, val, data): If H does not already contain the position pos, perform Insert(H, pos, val, data). Otherwise, set this position's value to max{val, val′}, where val′ is its previous value, and change the 'data' field accordingly.

BoundedMax(H, pos): Return the item (with additional data) that has maximum value among all items in H with position smaller than pos. If H does not contain any item with position smaller than pos, return 0.

The following theorem from [7] presents the time complexity of the BoundedHeap data structure.

Theorem 1. ([7]) The BoundedHeap data structure can support each of the above operations in O(log log n) amortized time, where keys are drawn from the set {1, . . . , n}. The data structure requires O(n) space.

The algorithm AlgLCSNew proceeds as follows. Recall that we perform a row-by-row operation. We always deal with two BoundedHeap data structures. While considering row i, we already have the BoundedHeap data structure H_{i-1} at hand; now we construct the BoundedHeap data structure H_i. At first, H_i is initialized

² In [24], the running time of this preprocessing step is reported as O(R log log n) under the usual assumption that R ≥ n.
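To make the interface concrete, here is a deliberately naive Python stand-in for the BoundedHeap. This is an illustration, not the data structure of [7]: each operation below costs O(n), whereas the real structure supports all three in O(log log n) amortized time for keys from {1, . . . , n}:

```python
class NaiveBoundedHeap:
    """Naive stand-in for the BoundedHeap interface of [7].

    Operations here take O(n) time each; the real structure of [7]
    achieves O(log log n) amortized time and O(n) space.
    """

    def __init__(self):
        self.items = {}  # position -> (value, data)

    def insert(self, pos, val, data):
        self.items[pos] = (val, data)

    def increase_value(self, pos, val, data):
        # Insert pos, or raise its value to max{val, val'}; the data
        # field changes only when the value actually increases.
        if pos not in self.items or self.items[pos][0] < val:
            self.items[pos] = (val, data)

    def bounded_max(self, pos):
        # Best (value, data) among positions strictly smaller than pos;
        # (0, None) plays the role of the '0' returned by [7].
        best = (0, None)
        for p, (v, d) in self.items.items():
            if p < pos and v > best[0]:
                best = (v, d)
        return best
```

For instance, after inserting position 3 with value 5, `bounded_max(4)` reports that item, while `bounded_max(2)` reports `(0, None)` because no stored position is smaller than 2.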


to H_{i-1}. Assume that we are considering the match (i, j) ∈ M_i, 1 ≤ j ≤ n, where M_i = {(i, j) | (i, j) ∈ M, 1 ≤ j ≤ n}. We calculate T[i, j] as follows:

T[i, j].Value = BoundedMax(H_{i-1}, j).Value + 1,
T[i, j].Prev = BoundedMax(H_{i-1}, j).data³.

Then we perform the following operation:

IncreaseValue(H_i, j, T[i, j].Value, (i, j)).

The correctness of the above procedure follows from Facts 1 and 2. Due to Fact 1, as soon as we compute the T-value of a new match in a column j, we can forget about the previous matches of that column. So, as soon as we compute T[i, j] in row i, we insert it into H_i to update it for the next row, i.e. i + 1. And due to Fact 2, we can use H_{i-1} for the computation of the T-values of the matches in row i and do the updates in H_i (initialized at first to H_{i-1}) to make H_i ready for row i + 1.

Next, we analyze the running time of AlgLCSNew. The preprocessing requires O(R log log n + n) time to get the set M in the required order [24]². Then we calculate each (i, j) ∈ M according to Equation 1 using the BoundedHeap data structure. Note carefully that we need to use the two operations, BoundedMax() and IncreaseValue(), once each for each of the matches in M. So, in total, the running time is O(R log log n + n). The space requirement is as follows. The preprocessing step requires Θ(max{R, n}) space [24]. And, in the main algorithm, we only need to keep track of two BoundedHeap data structures at a time, requiring O(n) space (Theorem 1). So, in total, the space requirement is Θ(max{R, n}).

Theorem 2. Problem LCS can be solved in O(R log log n + n) time requiring Θ(max{R, n}) space.

Algorithm 1 formally presents the algorithm.
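The row-by-row scheme can also be sketched in Python. To keep the sketch self-contained, the BoundedHeap is replaced by a plain dict scanned linearly (an assumption made here for illustration), so this version costs O(Rn) time rather than O(R log log n + n); the control flow, however, mirrors the procedure just described:

```python
def alg_lcs_new(X, Y):
    """Row-by-row LCS over the match set M, in the style of AlgLCSNew.

    H maps a column j to (value, match): the best T-value of any match
    seen in column j in earlier rows (keeping only the best per column
    is safe by Fact 1). A real implementation would use the
    BoundedHeap of [7] here instead of the linear scans below.
    """
    H = {}
    prev = {}                       # match -> predecessor match (Prev links)
    best_val, best_match = 0, None  # plays the role of globalLCS
    for i in range(1, len(X) + 1):
        row_updates = []
        for j in range(1, len(Y) + 1):
            if X[i - 1] != Y[j - 1]:
                continue            # (i, j) is not a match
            # BoundedMax(H_{i-1}, j): best value among columns < j.
            cand = [(v, m) for col, (v, m) in H.items() if col < j]
            v, m = max(cand, key=lambda vm: vm[0]) if cand else (0, None)
            t = v + 1
            row_updates.append((j, t))
            prev[(i, j)] = m
            if t > best_val:
                best_val, best_match = t, (i, j)
        # Apply the row's IncreaseValue updates only after the row is
        # done, so row i never reads its own entries (Fact 2).
        for j, t in row_updates:
            if j not in H or H[j][0] < t:
                H[j] = (t, (i, j))
    # Recover an actual LCS by following the Prev links.
    s, m = [], best_match
    while m is not None:
        s.append(X[m[0] - 1])
        m = prev[m]
    return best_val, "".join(reversed(s))
```

On Example 1, `alg_lcs_new("TCCACA", "ACCAAG")` returns `(4, "CCAA")`. Buffering each row's updates in `row_updates` is the sketch's substitute for keeping the two structures H_{i-1} and H_i side by side.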
Since we are not calculating all the entries of the table, we, of course, need to use variables to keep track of the actual LCS.

4 CLCS Algorithm

In this section, we present the new algorithm for Problem CLCS. We use the dynamic programming formulation for CLCS presented in [2, 3]. Extending our tabular notation from Equation 1, we use T[i, j, k], 1 ≤ i, j ≤ n, 0 ≤ k ≤ p, to denote L_{Z[1..k]}(X[1..i], Y[1..j]). We have the following formulation for Problem CLCS from [2, 3]:

T[i, j, k] = max{T′[i, j, k], T″[i, j, k], T[i, j − 1, k], T[i − 1, j, k]}      (2)

³ Note that we use the 'data' field to keep the 'address' of the match.


Algorithm 1 AlgLCSNew
 1: Construct the set M using Algorithm 1 of [24]. Let M_i = {(i, j) | (i, j) ∈ M, 1 ≤ j ≤ n}.
 2: globalLCS.Instance = ε
 3: globalLCS.Value = 0
 4: H_0 = ∅
 5: for i = 1 to n do
 6:   H_i = H_{i-1}
 7:   for each (i, j) ∈ M_i do
 8:     maxresult = BoundedMax(H_{i-1}, j)
 9:     T[i, j].Value = maxresult.Value + 1
10:     T[i, j].Prev = maxresult.Instance
11:     if globalLCS.Value < T[i, j].Value then
12:       globalLCS.Value = T[i, j].Value
13:       globalLCS.Instance = (i, j)
14:     end if
15:     IncreaseValue(H_i, j, T[i, j].Value, (i, j))
16:   end for
17:   Delete H_{i-1}
18: end for
19: return globalLCS

where

T′[i, j, k] =
  1 + T[i − 1, j − 1, k − 1]   if (k = 1 or (k > 1 and T[i − 1, j − 1, k − 1] > 0))
                               and X[i] = Y[j] = Z[k],
  0                            otherwise,      (3)

and

T″[i, j, k] =
  1 + T[i − 1, j − 1, k]       if (k = 0 or T[i − 1, j − 1, k] > 0) and X[i] = Y[j],
  0                            otherwise.      (4)

The following boundary conditions are assumed in Equations 2 to 4:

T[i, 0, k] = T[0, j, k] = 0,   0 ≤ i, j ≤ n, 0 ≤ k ≤ p.

It is straightforward to give an O(pn²) algorithm realizing the dynamic programming formulation presented in Equations 2 to 4. Our goal is to present a new efficient algorithm using the parameter R. In line with Equation 1, we can reformulate Equations 3 and 4 as follows:


V_1 = max{T[l_i, l_j, k − 1] | (l_i, l_j) ∈ M, 1 ≤ l_i < i, 1 ≤ l_j < j},      (5)

T′[i, j, k] =
  1 + V_1   if (k = 1 or (k > 1 and V_1 > 0)) and X[i] = Y[j] = Z[k],
  0         otherwise,      (6)

V_2 = max{T[l_i, l_j, k] | (l_i, l_j) ∈ M, 1 ≤ l_i < i, 1 ≤ l_j < j},      (7)

T″[i, j, k] =
  1 + V_2   if (k = 0 or V_2 > 0) and X[i] = Y[j],
  0         otherwise.      (8)


Algorithm 2 AlgCLCSNew
 1: Construct the set M using Algorithm 1 of [24]. Let M_i = {(i, j) | (i, j) ∈ M, 1 ≤ j ≤ n}.
 2: globalCLCS.Instance = ε
 3: globalCLCS.Value = 0
 4: H^{-1}_0 = ∅
 5: H^0_0 = ∅
 6: for i = 1 to n do
 7:   for k = 0 to p do
 8:     H^k_i = H^k_{i-1}
 9:     for each (i, j) ∈ M_i do
10:       max_1 = BoundedMax(H^{k-1}_{i-1}, j)
11:       max_2 = BoundedMax(H^k_{i-1}, j)
          {First, we consider Equations 7 and 8}
12:       Val_2.Value = 0
13:       Val_2.Instance = ε
14:       if (k = 0) or (max_2.Value > 0) then
15:         Val_2 = max_2
16:       end if
          {Now, we consider Equations 5 and 6}
17:       Val_1.Value = 0
18:       Val_1.Instance = ε
19:       if X[i] = Z[k] then   {this means X[i] = Y[j] = Z[k]}
20:         if (k = 1) or ((k > 1) and (max_1.Value > 0)) then
21:           Val_1 = max_1
22:         end if
23:       end if
          {Finally, we consider Equation 2}
24:       maxresult = max{Val_1, Val_2}
25:       T^k[i, j].Value = maxresult.Value + 1
26:       T^k[i, j].Prev = maxresult.Instance
27:       if globalCLCS.Value < T^k[i, j].Value then
28:         globalCLCS.Value = T^k[i, j].Value
29:         globalCLCS.Instance = (i, j, k)
30:       end if
31:       IncreaseValue(H^k_i, j, T^k[i, j].Value, (i, j, k))
32:     end for
33:     Delete H^{k-1}_{i-1}
34:   end for
35: end for
36: return globalCLCS
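For contrast with Algorithm 2, the straightforward O(pn²) realization of Equations 2 to 4 (the formulation of [2, 3], without the match-set machinery) can be sketched as follows. The encoding assumption is the one implicit in the formulation: for k ≥ 1, a table value of 0 means that no common subsequence containing Z[1..k] exists yet:

```python
def clcs_length(X, Y, Z):
    """Direct DP over Equations 2-4:
    T[k][i][j] = L_{Z[1..k]}(X[1..i], Y[1..j]), with 0 meaning
    'no common subsequence containing Z[1..k] exists' when k >= 1."""
    n1, n2, p = len(X), len(Y), len(Z)
    T = [[[0] * (n2 + 1) for _ in range(n1 + 1)] for _ in range(p + 1)]
    for k in range(p + 1):
        for i in range(1, n1 + 1):
            for j in range(1, n2 + 1):
                # Equation 2: drop a symbol of X or of Y.
                best = max(T[k][i - 1][j], T[k][i][j - 1])
                if X[i - 1] == Y[j - 1]:
                    # T'' (Eq. 4): extend without consuming a symbol of Z.
                    if k == 0 or T[k][i - 1][j - 1] > 0:
                        best = max(best, 1 + T[k][i - 1][j - 1])
                    # T' (Eq. 3): the matched symbol is also Z[k].
                    if k >= 1 and X[i - 1] == Z[k - 1] and \
                            (k == 1 or T[k - 1][i - 1][j - 1] > 0):
                        best = max(best, 1 + T[k - 1][i - 1][j - 1])
                T[k][i][j] = best
    return T[p][n1][n2]
```

On Example 1, `clcs_length("TCCACA", "ACCAAG", "AC")` returns 3; with an empty constraint the recurrence collapses to the classic LCS DP and returns 4.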


– We update H^k_i, which is initialized to H^k_{i-1} when we start row i. H^k_i is later used when we consider the matches of row i + 1 of the two matrices T^k and T^{k+1}.

Due to Fact 4, while computing a T^k[i][j], (i, j) ∈ M, 1 ≤ i, j ≤ n, 1 ≤ k ≤ p, we can employ, with H^k_i, H^k_{i-1} and H^{k-1}_{i-1}, the same technique used in AlgLCSNew (with H_i and H_{i-1}) to compute T[i, j], (i, j) ∈ M, 1 ≤ j ≤ n. On the other hand, courtesy of Fact 3, to realize Equations 5−8 we need only keep track of H^{k-1}_{i-1} and H^k_{i-1}. The steps are formally stated in Algorithm 2. In light of the analysis of AlgLCSNew, it is quite easy to see that the running time is O(pR log log n + n). The space complexity, like that of AlgLCSNew, is dominated by the preprocessing step of [24], because in the main algorithm we only need to keep track of three BoundedHeap data structures and the third string Z, requiring O(n) space in total.

Theorem 3. Problem CLCS can be solved in O(pR log log n + n) time requiring Θ(max{R, n}) space.

5 Conclusion

In this paper, we have studied the classic and well-studied longest common subsequence (LCS) problem and a recent variant of it, namely the constrained LCS (CLCS) problem. In LCS, given two strings, we want to find a common subsequence of maximum length; in CLCS, in addition to that, the solution must also be a supersequence of a third given string. The traditional LCS problem has extensive applications in diverse areas of computer science, including DNA sequence analysis, genetics and molecular biology.
CLCS, a relatively new variant of LCS, gets its motivation from computational bioinformatics as well. In this paper, we have presented an efficient algorithm for the traditional LCS problem that runs in O(R log log n + n) time. Then, using this algorithm, we have devised an algorithm for the CLCS problem having time complexity O(pR log log n + n) in the worst case. Note that if R = o(n²), our algorithms perform very well, but if R = Θ(n²) then, due to the log log n term, they behave slightly worse than the existing algorithms. It would be interesting to see whether our techniques can be employed for other variants of the LCS problem to devise similarly efficient algorithms.

References

1. V. Arlazarov, E. Dinic, M. Kronrod, and I. Faradzev. On economic construction of the transitive closure of a directed graph (English translation). Soviet Math. Dokl., 11:1209–1210, 1975.
2. A. N. Arslan and Ö. Eğecioğlu. Algorithms for the constrained longest common subsequence problems. In M. Simánek and J. Holub, editors, The Prague Stringology Conference, pages 24–32. Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University, 2004.
3. A. N. Arslan and Ö. Eğecioğlu. Algorithms for the constrained longest common subsequence problems. Int. J. Found. Comput. Sci., 16(6):1099–1109, 2005.
4. S. Bereg and B. Zhu. RNA multiple structural alignment with longest common subsequences. In L. Wang, editor, Computing and Combinatorics (COCOON), volume 3595 of Lecture Notes in Computer Science, pages 32–41. Springer, 2005.


5. L. Bergroth, H. Hakonen, and T. Raita. A survey of longest common subsequence algorithms. In String Processing and Information Retrieval (SPIRE), pages 39–48. IEEE Computer Society, 2000.
6. G. Blin, G. Fertin, R. Rizzi, and S. Vialette. What makes the arc-preserving subsequence problem hard? In V. S. Sunderam, G. D. van Albada, P. M. A. Sloot, and J. Dongarra, editors, International Conference on Computational Science, volume 3515 of Lecture Notes in Computer Science, pages 860–868. Springer, 2005.
7. G. S. Brodal, K. Kaligosi, I. Katriel, and M. Kutz. Faster algorithms for computing longest common increasing subsequences. In M. Lewenstein and G. Valiente, editors, CPM, volume 4009 of Lecture Notes in Computer Science, pages 330–341. Springer, 2006.
8. F. Y. L. Chin, A. D. Santis, A. L. Ferrara, N. L. Ho, and S. K. Kim. A simple algorithm for the constrained sequence problems. Inf. Process. Lett., 90(4):175–179, 2004.
9. M. Crochemore, G. M. Landau, and M. Ziv-Ukelson. A sub-quadratic sequence alignment algorithm for unrestricted cost matrices. In Symposium on Discrete Algorithms (SODA), pages 679–688, 2002.
10. P. Evans. Algorithms and complexity for annotated sequence analysis. Ph.D. thesis, University of Victoria, 1999.
11. F. Hadlock. Minimum detour methods for string or sequence comparison. Congressus Numerantium, 61:263–274, 1988.
12. D. S. Hirschberg. Algorithms for the longest common subsequence problem. J. ACM, 24(4):664–675, 1977.
13. J. W. Hunt and T. G. Szymanski. A fast algorithm for computing longest common subsequences. Commun. ACM, 20(5):350–353, 1977.
14. T. Jiang and M. Li. On the approximation of shortest common supersequences and longest common subsequences. SIAM Journal on Computing, 24(5):1122–1139, 1995.
15. T. Jiang, G. Lin, B. Ma, and K. Zhang. The longest common subsequence problem for arc-annotated sequences. In R. Giancarlo and D. Sankoff, editors, Combinatorial Pattern Matching (CPM), volume 1848 of Lecture Notes in Computer Science, pages 154–165. Springer, 2000.
16. V. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Problems in Information Transmission, 1:8–17, 1965.
17. G.-H. Lin, Z.-Z. Chen, T. Jiang, and J. Wen. The longest common subsequence problem for sequences with nested arc annotations. J. Comput. Syst. Sci., 65(3):465–480, 2002.
18. B. Ma and K. Zhang. On the longest common rigid subsequence problem. In A. Apostolico, M. Crochemore, and K. Park, editors, CPM, volume 3537 of Lecture Notes in Computer Science, pages 11–20. Springer, 2005.
19. D. Maier. The complexity of some problems on subsequences and supersequences. Journal of the ACM, 25(2):322–336, 1978.
20. W. J. Masek and M. Paterson. A faster algorithm computing string edit distances. J. Comput. Syst. Sci., 20(1):18–31, 1980.
21. V. Mäkinen, G. Navarro, and E. Ukkonen. Transposition invariant string matching. Journal of Algorithms, 56:124–153, 2005.
22. E. W. Myers. An O(ND) difference algorithm and its variations. Algorithmica, 1(2):251–266, 1986.
23. N. Nakatsu, Y. Kambayashi, and S. Yajima. A longest common subsequence algorithm suitable for similar text strings. Acta Inf., 18:171–179, 1982.
24. M. S. Rahman and C. S. Iliopoulos. Algorithms for computing variants of the longest common subsequence problem. In ISAAC, pages 399–408, 2006.
25. Y.-T. Tsai. The constrained longest common subsequence problem. Inf. Process. Lett., 88(4):173–176, 2003.
26. R. A. Wagner and M. J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168–173, 1974.
