15.01.2013 Views

U. Glaeser



TABLE 32.3 Best Known I/O Bounds for Batched Graph Problems for the Single-Disk Case D = 1

Graph Problem
    I/O bound, D = 1

List ranking, Euler tour of a tree, centroid decomposition, expression tree evaluation
    $\Theta(\mathrm{Sort}(V))$ [54]

Connected components, minimum spanning forest (MSF)
    $O\left(\max\left\{1, \log\log\frac{V}{e}\right\} \frac{E}{V}\,\mathrm{Sort}(V)\right)$ [20, 77, 151] (deterministic)
    $\Theta\left(\frac{E}{V}\,\mathrm{Sort}(V)\right)$ [54] (randomized)

Bottleneck MSF, biconnected components
    $O\left(\min\left\{V^2,\ \max\left\{1, \log\frac{V}{M}\right\} \frac{E}{V}\,\mathrm{Sort}(V),\ (\log B)\,\frac{E}{V}\,\mathrm{Sort}(V) + e \log V\right\}\right)$ [2, 54, 77, 128] (deterministic)
    $\Theta\left(\frac{E}{V}\,\mathrm{Sort}(V)\right)$ [54, 77] (randomized)

Ear decomposition, maximal matching
    $O\left(\min\left\{V^2,\ \max\left\{1, \log\frac{V}{M}\right\} \mathrm{Sort}(E),\ (\log B)\,\mathrm{Sort}(E) + e \log V\right\}\right)$ [2, 54, 128] (deterministic)
    $O(\mathrm{Sort}(E))$ [54] (randomized)

Undirected breadth-first search
    $O(\mathrm{BundleSort}(E, V) + V)$ [151]

Undirected single-source shortest paths
    $O(e \log e + V)$ [128]

Directed and undirected depth-first search, topological sorting, directed breadth-first search, directed single-source shortest paths
    $O\left(\min\left\{\frac{\upsilon e}{m} + V,\ (V + e) \log \upsilon\right\}\right)$ [49, 54, 128]

Transitive closure
    $O\left(V \upsilon \sqrt{\frac{e}{m}}\right)$ [54]

Note: The number of vertices is denoted by V = υB and the number of edges by E = eB. The terms Sort(N) and BundleSort(N, K) are defined in Sections 32.4 and 32.6.

© 2002 by CRC Press LLC

32.10 External Hashing for Online Dictionary Search

This section focuses on online data structures for supporting the dictionary operations of insert, delete, and lookup. Given a value x, the lookup operation returns the item(s), if any, in the structure with key value x. The two main types of EM dictionaries are hashing, which we discuss in this section, and tree-based approaches, which are deferred until Section 32.11. The advantage of hashing is that the expected number of probes per operation is a constant, regardless of the number N of items. The common element of all EM hashing algorithms is a predefined hash function

hash : {all possible keys} → {0, 1, 2, ..., K − 1}

that assigns the N items to K address locations in a uniform manner. Hashing algorithms differ from one another in how they resolve the collision that results when there is no room to store an item at its assigned location.
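As a rough illustration of the idea above, the following Python sketch maps keys to K address locations and resolves collisions by chaining overflow blocks. Chaining is only one possible collision strategy, picked here for simplicity; the text does not prescribe it. The class name `ChainedBlockTable`, the `reads` counter, and the parameter values are all hypothetical, and Python's built-in `hash` stands in for the predefined hash function. Each block holds at most B items, and every block examined is counted as one I/O.

```python
from typing import Hashable, List, Optional, Tuple

Block = List[Tuple[Hashable, object]]  # one disk block: up to B (key, value) pairs


class ChainedBlockTable:
    """Illustrative static table: K buckets, each a chain of blocks of capacity B."""

    def __init__(self, num_buckets: int, block_size: int) -> None:
        self.K = num_buckets
        self.B = block_size
        # Every bucket starts with a single empty primary block.
        self.buckets: List[List[Block]] = [[[]] for _ in range(num_buckets)]
        self.reads = 0  # number of blocks examined (a stand-in for I/Os)

    def _address(self, key: Hashable) -> int:
        # Assumption: Python's built-in hash() plays the role of the hash function.
        return hash(key) % self.K

    def lookup(self, key: Hashable) -> Optional[object]:
        for block in self.buckets[self._address(key)]:
            self.reads += 1
            for k, v in block:
                if k == key:
                    return v
        return None

    def insert(self, key: Hashable, value: object) -> None:
        chain = self.buckets[self._address(key)]
        free: Optional[Block] = None
        for block in chain:
            self.reads += 1
            for i, (k, _) in enumerate(block):
                if k == key:  # key already present: overwrite in place
                    block[i] = (key, value)
                    return
            if free is None and len(block) < self.B:
                free = block  # remember the first block with room
        if free is None:
            # Collision with no room left: chain a new overflow block.
            free = []
            chain.append(free)
        free.append((key, value))


table = ChainedBlockTable(num_buckets=2, block_size=2)
for key in range(8):
    table.insert(key, key * key)
print(table.lookup(6), len(table.buckets[0]))
```

With only two buckets and blocks of two items, the eight keys overflow their primary blocks, so a lookup may have to read more than one block. Keeping chains short, so that the expected number of blocks read stays constant, is what the choice of K and of the collision strategy is meant to achieve.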

The goals in EM hashing are to achieve an average of O(Output(Z)) = O(⌈z⌉) I/Os per lookup, where Z = zB is the number of items output, O(1) I/Os per insert and delete, and linear disk space. Most traditional hashing methods use a statically allocated table and are thus designed to handle only a fixed
