Vol 9 No1 - Journal of Cell and Molecular Biology - Haliç Üniversitesi
Arumugam KUNTHAVAI and Somasundaram VASANTHA RATHNA
array. The Lcp-interval tree of S = acaaacatat$ is shown in Figure 2.
Figure 2. The Lcp-interval tree of S = acaaacatat$
An interval [i..j], where 0 ≤ i < j ≤ n, in an Lcp-array is called an Lcp-interval of Lcp-value ℓ (denoted by ℓ-[i..j]) if
1. Lcptab[i] < ℓ,
2. Lcptab[k] ≥ ℓ for all k with i+1 ≤ k ≤ j,
3. Lcptab[k] = ℓ for at least one k with i+1 ≤ k ≤ j,
4. Lcptab[j+1] < ℓ.
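The four conditions can be checked mechanically. The following is a minimal Python sketch, not the authors' implementation. It reads the second condition as Lcptab[k] ≥ ℓ (every suffix inside the interval shares at least ℓ characters with its predecessor). The helper names `build_tables` and `is_lcp_interval` are hypothetical, the tables are built naively, and two assumptions are made: the sentinel $ sorts after every other character, and the entry one past the end of Lcptab counts as -1.

```python
def build_tables(s):
    """Naive suffix array and Lcp table; fine for short examples only."""
    # Assumption: the sentinel '$' sorts after every other character.
    order = lambda p: [(c == "$", c) for c in s[p:]]
    suftab = sorted(range(len(s)), key=order)
    lcptab = [0] * len(s)  # lcptab[0] is set to 0 by convention
    for r in range(1, len(s)):
        a, b = s[suftab[r - 1]:], s[suftab[r]:]
        # Count matching leading characters of two adjacent suffixes.
        while lcptab[r] < min(len(a), len(b)) and a[lcptab[r]] == b[lcptab[r]]:
            lcptab[r] += 1
    return suftab, lcptab


def is_lcp_interval(lcptab, i, j, ell):
    """Check the four conditions for [i..j] being an Lcp-interval of value ell."""
    n = len(lcptab) - 1
    if not (0 <= i < j <= n):
        return False
    # Assumption: the entry one past the end of the table counts as -1.
    after = lcptab[j + 1] if j + 1 <= n else -1
    inner = lcptab[i + 1:j + 1]
    return (lcptab[i] < ell          # condition 1
            and min(inner) >= ell    # condition 2
            and ell in inner         # condition 3
            and after < ell)         # condition 4


suftab, lcptab = build_tables("acaaacatat$")
print(lcptab)
print(is_lcp_interval(lcptab, 0, 5, 1))  # suffixes starting with "a"
print(is_lcp_interval(lcptab, 0, 3, 1))  # too short: index 4 still belongs
```

With Lcptab[0] = 0 the whole-array interval [0..n] does not pass condition 1 for ℓ = 0, so the root of the tree has to be treated as a special case in this sketch.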
Every index k, i+1 ≤ k ≤ j, with Lcptab[k] = ℓ is called an ℓ-index. The set of all ℓ-indices of an ℓ-interval [i..j] will be denoted by ℓIndices(i, j). If [i..j] is an ℓ-interval such that ω = S[suftab[i]..suftab[i]+ℓ-1] is the longest common prefix of the suffixes S_suftab[i], S_suftab[i+1], …, S_suftab[j], then [i..j] is also called an ω-interval. Based on the analogy between the suffix array and the suffix tree, it is desirable to enhance the suffix array with additional information so that, for any ℓ-interval [i..j], all its child intervals can be determined in constant time. This is achieved by enhancing the suffix array with two tables.
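As an illustration of ℓ-indices and ω-intervals, the sketch below recovers ℓ, the ℓ-indices, and ω for a given Lcp-interval of S = acaaacatat$. It is only a sketch under stated assumptions: the function name `describe_interval` is hypothetical, the two tables are hard-coded for this one string assuming that $ sorts after every letter, and the caller is assumed to pass a valid Lcp-interval.

```python
S = "acaaacatat$"
# Assumption: tables for S computed with '$' sorting after all letters.
SUFTAB = [2, 3, 0, 4, 6, 8, 1, 5, 7, 9, 10]
LCPTAB = [0, 2, 1, 3, 1, 2, 0, 2, 0, 1, 0]


def describe_interval(s, suftab, lcptab, i, j):
    """Return (ell, ell_indices, omega) for an Lcp-interval [i..j]."""
    inner = range(i + 1, j + 1)
    # The Lcp-value is the smallest Lcptab entry strictly inside the interval.
    ell = min(lcptab[k] for k in inner)
    # The ell-indices are the positions where that minimum is attained.
    ell_indices = [k for k in inner if lcptab[k] == ell]
    # omega is the common prefix of length ell of all suffixes in [i..j].
    omega = s[suftab[i]:suftab[i] + ell]
    return ell, ell_indices, omega


print(describe_interval(S, SUFTAB, LCPTAB, 0, 5))
print(describe_interval(S, SUFTAB, LCPTAB, 2, 3))
```

The ℓ-indices split an interval into its child intervals: for [0..5] the indices 2 and 4 separate the children [0..1], [2..3], and [4..5].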
Enhanced Suffix array<br />
The new data structure consists of the suffix array, the Lcp-table (Lcptab), and an additional table: the child-table cldtab shown in Table 2.
The child-table is a table of size n+1 indexed from 0 to n, and each entry contains three values: up, down, and nextℓIndex. Each of these three values requires 4 bytes in the worst case. The values of each cldtab-entry are defined as follows (it is assumed that min ∅ = max ∅ = ⊥, i.e. the value is undefined when the set is empty):
1. cldtab[i].up = min {q ∈ [0..i-1] | Lcptab[q] > Lcptab[i] and for all k ∈ [q+1..i-1]: Lcptab[k] ≥ Lcptab[q]}
2. cldtab[i].down = max {q ∈ [i+1..n] | Lcptab[q] > Lcptab[i] and for all k ∈ [i+1..q-1]: Lcptab[k] > Lcptab[q]}
3. cldtab[i].nextℓIndex = min {q ∈ [i+1..n] | Lcptab[q] = Lcptab[i] and for all k ∈ [i+1..q-1]: Lcptab[k] > Lcptab[i]}
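The three definitions can be evaluated directly, set by set, which is useful for checking a faster construction on small inputs. The sketch below does this by brute force (quadratic or worse), so it is a reference check rather than a practical construction. The function name `child_table` is hypothetical, an empty set is represented by `None`, and the Lcp table is hard-coded for S = acaaacatat$ assuming that $ sorts after every letter. The inner condition for down is taken as strict (Lcptab[k] > Lcptab[q]), so that down points at the first ℓ-index of a child interval; this is an assumption of the sketch.

```python
# Assumption: Lcp table of "acaaacatat$" with '$' sorting after all letters.
LCPTAB = [0, 2, 1, 3, 1, 2, 0, 2, 0, 1, 0]


def child_table(lcptab):
    """Brute-force evaluation of the up, down and next-l-index definitions."""
    n = len(lcptab) - 1
    up, down, nxt = [], [], []
    for i in range(n + 1):
        ups = [q for q in range(0, i)
               if lcptab[q] > lcptab[i]
               and all(lcptab[k] >= lcptab[q] for k in range(q + 1, i))]
        downs = [q for q in range(i + 1, n + 1)
                 if lcptab[q] > lcptab[i]
                 and all(lcptab[k] > lcptab[q] for k in range(i + 1, q))]
        nexts = [q for q in range(i + 1, n + 1)
                 if lcptab[q] == lcptab[i]
                 and all(lcptab[k] > lcptab[i] for k in range(i + 1, q))]
        # min/max of an empty set is undefined, represented here by None.
        up.append(min(ups) if ups else None)
        down.append(max(downs) if downs else None)
        nxt.append(min(nexts) if nexts else None)
    return up, down, nxt


up, down, nxt = child_table(LCPTAB)
for i in range(len(LCPTAB)):
    print(i, up[i], down[i], nxt[i])
```

For index 0 this yields down = 2 and nextℓIndex = 6: position 2 is the first 1-index of the child interval [0..5], and position 6 is the next 0-index of the root interval.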
The child-table stores the parent-child relationship of Lcp-intervals. For an ℓ-interval [i..j] whose ℓ-indices are i1 < i2