Algorithms and Data Structures for External Memory
[Figure: Patricia trie diagram. Internal nodes carry the character indexes 0, 3, 4, 5, 6, 6; edges are labeled a, b, or c; the leaves are abaaba, abaabbabba, abac, bcbcaba, bcbcabb, bcbcbba, bcbcbbba, with lengths 6, 10, 4, 7, 7, 7, 8.]<br />
14.2 String B-Trees<br />
Fig. 14.1 Patricia trie representation of a single node of an SB-tree, with branching factor<br />
B = 8. The seven strings used for partitioning are pictured at the leaves; in the actual data<br />
structure, pointers to the strings, not the strings themselves, are stored at the leaves. The<br />
pointers to the B children of the SB-tree node are also stored at the leaves.<br />
label for each of its outgoing edges. Navigation from root to leaf in<br />
the Patricia trie is done using the bit representation of the strings. For<br />
example, suppose we want to search for the leaf “abac.” We start at<br />
the root, which has index 0; the index indicates that we should examine<br />
character 0 of the search string “abac” (namely, “a”), which leads us to<br />
follow the branch labeled “a” (left branch). The next node we encounter<br />
has index 3, and so we examine character 3 (namely, “c”), follow the<br />
branch labeled “c” (right branch), and arrive at the leaf “abac.”<br />
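The walk just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `descend` function, and the `FIG_14_1` constant are hypothetical names, the trie is rebuilt by hand from the seven leaf strings of Fig. 14.1, and leaves are plain strings rather than the pointers the actual structure stores.<br />

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Node:
    """Internal Patricia trie node (hypothetical layout, for illustration)."""
    index: int  # position of the character to examine in the search string
    children: Dict[str, Union["Node", str]] = field(default_factory=dict)


# Trie of Fig. 14.1, rebuilt by hand from its seven leaf strings (an assumption).
# Leaves are plain strings here; the SB-tree itself stores pointers to strings.
FIG_14_1 = Node(0, {
    "a": Node(3, {
        "a": Node(5, {"a": "abaaba", "b": "abaabbabba"}),
        "c": "abac",
    }),
    "b": Node(4, {
        "a": Node(6, {"a": "bcbcaba", "b": "bcbcabb"}),
        "b": Node(6, {"a": "bcbcbba", "b": "bcbcbbba"}),
    }),
})


def descend(root: Node, pattern: str) -> Tuple[str, List[int]]:
    """Walk from the root to a leaf, examining one character per node."""
    examined: List[int] = []
    node: Union[Node, str] = root
    while isinstance(node, Node):
        examined.append(node.index)
        # Raises KeyError/IndexError if the pattern has no matching branch
        # or is too short; a fuller version would handle both cases.
        node = node.children[pattern[node.index]]
    return node, examined


leaf, examined = descend(FIG_14_1, "abac")
print(leaf, examined)  # abac [0, 3]
```

Only characters 0 and 3 of “abac” are inspected, matching the two nodes visited in the text.<br />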
Searching for a text string that does not match one of the leaves<br />
is more complicated and exploits the full power of the data structure,<br />
using an amortization technique of Ajtai et al. [25]. Suppose we want<br />
to search for “bcbabcba.” Starting at the root, with index 0, we examine<br />
character 0 of “bcbabcba” (namely, “b”) and follow the branch<br />
labeled “b” (right branch). We continue searching in this manner, which<br />
leads along the rightmost path, examining indexes 4 and 6, and eventually<br />
we arrive at the far-right leaf “bcbcbbba.” However, the search<br />
string “bcbabcba” does not match the leaf string “bcbcbbba.” The<br />
problem is that they differ at index 3, but only indexes 0, 4, and 6 were<br />
examined in the traversal, and thus the difference was not detected.<br />
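The undetected difference can be reproduced with a small Python sketch. As before, this is an illustration under assumptions rather than the book's implementation: `Node`, `blind_descent`, `first_mismatch`, and `FIG_14_1` are hypothetical names, the trie is rebuilt by hand from the leaf strings of Fig. 14.1, and leaves are plain strings instead of pointers.<br />

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union


@dataclass
class Node:
    """Internal Patricia trie node (hypothetical layout, for illustration)."""
    index: int  # position of the character to examine in the search string
    children: Dict[str, Union["Node", str]] = field(default_factory=dict)


# Trie of Fig. 14.1, rebuilt by hand from its seven leaf strings (an assumption).
FIG_14_1 = Node(0, {
    "a": Node(3, {
        "a": Node(5, {"a": "abaaba", "b": "abaabbabba"}),
        "c": "abac",
    }),
    "b": Node(4, {
        "a": Node(6, {"a": "bcbcaba", "b": "bcbcabb"}),
        "b": Node(6, {"a": "bcbcbba", "b": "bcbcbbba"}),
    }),
})


def blind_descent(root: Node, pattern: str) -> Tuple[str, List[int]]:
    """Descend using only the indexed characters; skipped positions are ignored."""
    examined: List[int] = []
    node: Union[Node, str] = root
    while isinstance(node, Node):
        examined.append(node.index)
        node = node.children[pattern[node.index]]
    return node, examined


def first_mismatch(s: str, t: str) -> Optional[int]:
    """Return the first index at which s and t differ, or None if they are equal."""
    for i, (a, b) in enumerate(zip(s, t)):
        if a != b:
            return i
    return None if len(s) == len(t) else min(len(s), len(t))


pattern = "bcbabcba"
leaf, examined = blind_descent(FIG_14_1, pattern)
print(leaf, examined)                 # bcbcbbba [0, 4, 6]
print(first_mismatch(pattern, leaf))  # 3
```

The first mismatch, at index 3, is not among the examined indexes 0, 4, and 6, which is why the descent alone cannot decide whether the search string is present.<br />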
In order to determine efficiently whether or not there is a match, we<br />
go back and sequentially compare the characters of the search string<br />