12.07.2015 Views

A Practical Introduction to Data Structures and Algorithm Analysis

Chap. 13 Advanced Tree Structures

Example 13.1 When searching for the value 7 (0000111 in binary) in the PAT trie of Figure 13.3, the root node indicates that bit position 0 (the leftmost bit) is checked first. Because the 0th bit for value 7 is 0, take the left branch. At level 1, branch depending on the value of bit 1, which again is 0. At level 2, branch depending on the value of bit 2, which again is 0. At level 3, the index stored in the node is 4. This means that bit 4 of the key is checked next. (The value of bit 3 is irrelevant, because all values stored in that subtree have the same value at bit position 3.) Thus, the single branch that extends from the equivalent node in Figure 13.1 is just skipped. For key value 7, bit 4 has value 1, so the right branch is taken. Because this leads to a leaf node, the search key is compared against the key stored in that node. If they match, then the desired record has been found.

Note that during the search process, only a single bit of the search key is compared at each internal node. This is significant, because the search key could be quite large. Search in the PAT trie requires only a single full-key comparison, which takes place once a leaf node has been reached.

Example 13.2 Consider the situation where we need to store a library of DNA sequences. A DNA sequence is a series of letters, usually many thousands of characters long, with the string coming from an alphabet of only four letters that stand for the four nucleotide bases making up a DNA strand. Similar DNA sequences might have long sections of their string that are identical. The PAT trie would avoid making multiple full-key comparisons when searching for a specific sequence.

13.2 Balanced Trees

We have noted several times that the BST has a high risk of becoming unbalanced, resulting in excessively expensive search and update operations. One solution to this problem is to adopt another search tree structure such as the 2-3 tree. An alternative is to modify the BST access functions in some way to guarantee that the tree performs well. This is an appealing concept, and it works well for heaps, whose access functions maintain the heap in the shape of a complete binary tree. Unfortunately, requiring that the BST always be in the shape of a complete binary tree requires excessive modification to the tree during update, as discussed in Section 10.3.
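The risk of imbalance noted above can be seen in a small experiment. The following is a minimal sketch, not code from the text: the names BSTNode, bst_insert, height, and build are hypothetical, and the insertion routine is a plain unbalanced BST insert with no rebalancing. It compares the height of a tree built from keys in ascending order with one built from the same keys in shuffled order.

```python
import random


class BSTNode:
    """A plain BST node with no balancing information (illustrative only)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Standard unbalanced BST insert, written iteratively so that a
    degenerate (list-like) tree does not exhaust the recursion limit."""
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = BSTNode(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = BSTNode(key)
                return root
            node = node.right


def height(root):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    best = 0
    stack = [(root, 1)] if root is not None else []
    while stack:
        node, depth = stack.pop()
        best = max(best, depth)
        for child in (node.left, node.right):
            if child is not None:
                stack.append((child, depth + 1))
    return best


def build(keys):
    """Insert the keys one at a time, in the order given."""
    root = None
    for k in keys:
        root = bst_insert(root, k)
    return root


n = 1000
sorted_height = height(build(range(n)))   # ascending order: every node has only a right child

keys = list(range(n))
random.Random(0).shuffle(keys)            # fixed seed, chosen arbitrarily for repeatability
shuffled_height = height(build(keys))

print(sorted_height, shuffled_height)
```

With ascending input the tree degenerates into a chain whose height equals the number of keys, so a search costs as much as scanning a linked list. The shuffled input usually gives a much shallower tree, but nothing in the insert routine guarantees it, which is the motivation for the balanced structures in this section.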
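Returning to the PAT trie search of Example 13.1, the procedure can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the book's implementation: the names Leaf, Internal, bit_at, and pat_search are hypothetical, keys are assumed to be 7 bits wide with bit 0 as the leftmost bit, and the small trie built here is hand-made. Only the path to key 7 (bits 0, 1, 2, then 4) follows the example; the other leaves are placeholder keys and are not meant to reproduce Figure 13.3.

```python
from dataclasses import dataclass
from typing import Optional, Union

KEY_BITS = 7  # assumption: 7-bit keys, as in "0000111" for the value 7


@dataclass
class Leaf:
    key: int


@dataclass
class Internal:
    bit_index: int   # which bit of the search key to test (0 = leftmost)
    left: "Node"     # followed when the tested bit is 0
    right: "Node"    # followed when the tested bit is 1


Node = Union[Leaf, Internal]


def bit_at(key: int, index: int) -> int:
    """Return bit `index` of `key`, counting from the leftmost of KEY_BITS bits."""
    return (key >> (KEY_BITS - 1 - index)) & 1


def pat_search(root: Node, key: int) -> Optional[Leaf]:
    """Test one bit per internal node, then do a single full-key
    comparison at the leaf that is reached."""
    node = root
    while isinstance(node, Internal):
        node = node.right if bit_at(key, node.bit_index) else node.left
    return node if node.key == key else None


# Hand-built trie. The node at level 3 tests bit 4 and skips bit 3,
# because both keys below it (2 and 7) have a 0 at bit position 3.
trie = Internal(0,
                Internal(1,
                         Internal(2,
                                  Internal(4, Leaf(2), Leaf(7)),
                                  Leaf(24)),
                         Leaf(32)),
                Leaf(120))

print(pat_search(trie, 7))    # follows left, left, left, right and matches at the leaf
print(pat_search(trie, 15))   # 0001111 reaches the same leaf, but the full-key check fails
```

The second lookup shows why the final comparison cannot be dropped. Key 15 differs from 7 only at bit 3, which the trie never tests, so the bit tests alone lead to the leaf holding 7, and only the full-key comparison reveals that 15 is not stored.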
