23.11.2014 Views

Data Structures and Algorithms in Java[1].pdf - Fulvio Frisone



• T has s external nodes.

• The number of nodes of T is O(s).

The attentive reader may wonder whether the compression of paths provides any significant advantage, since it is offset by a corresponding expansion of the node labels. Indeed, a compressed trie is truly advantageous only when it is used as an auxiliary index structure over a collection of strings already stored in a primary structure, and is not required to actually store all the characters of the strings in the collection.

Suppose, for example, that the collection S of strings is an array of strings S[0], S[1], …, S[s − 1]. Instead of storing the label X of a node explicitly, we represent it implicitly by a triplet of integers (i, j, k), such that X = S[i][j..k]; that is, X is the substring of S[i] consisting of the characters from the jth to the kth included. (See the example in Figure 12.9. Also compare with the standard trie of Figure 12.7.)
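The triplet representation can be illustrated with a minimal Java sketch. The class name CompactLabel, the method resolve, and the sample array are hypothetical and chosen only for illustration; they are not taken from the book. The sketch assumes 0-based character positions and treats both j and k as inclusive, as in the definition above.

```java
/** Hypothetical helper: an implicit node label (i, j, k) denoting S[i][j..k]. */
final class CompactLabel {
    final int i; // index of the string within the array S
    final int j; // position of the first character of the label (inclusive)
    final int k; // position of the last character of the label (inclusive)

    CompactLabel(int i, int j, int k) {
        this.i = i;
        this.j = j;
        this.k = k;
    }

    /** Recovers the explicit label X by reading it from the primary array. */
    String resolve(String[] s) {
        // String.substring excludes its end index, hence k + 1.
        return s[i].substring(j, k + 1);
    }

    public static void main(String[] args) {
        // Assumed sample collection, used only for this illustration.
        String[] s = {"see", "bear", "sell", "stock"};
        CompactLabel label = new CompactLabel(1, 1, 3);
        System.out.println(label.resolve(s)); // prints "ear"
    }
}
```

Each node stores three integers regardless of how long its label is, so the characters themselves are kept only once, in the array S.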

Figure 12.9: (a) Collection S of strings stored in an array. (b) Compact representation of the compressed trie for S.

This additional compression scheme allows us to reduce the total space for the trie itself from O(n) for the standard trie to O(s) for the compressed trie, where n is the total length of the strings in S and s is the number of strings in S. We must still store the different strings in S, of course, but we nevertheless reduce the space for the
