17.01.2013 Views

Algorithms and Data Structures for External Memory

Algorithms and Data Structures for External Memory

Algorithms and Data Structures for External Memory

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

12.3 Bootstrapping <strong>for</strong> 2-D Diagonal Corner<strong>and</strong> Stabbing Queries 107<br />

measured. Query per<strong>for</strong>mance was then tested as be<strong>for</strong>e. The results<br />

in Table 12.1 indicate that the buffer R*-tree <strong>and</strong> the Hilbert R-tree<br />

achieve a similar degree of packing, but the buffer R*-tree provides<br />

better update <strong>and</strong> query per<strong>for</strong>mance.<br />

12.3 Bootstrapping <strong>for</strong> 2-D Diagonal Corner<br />

<strong>and</strong> Stabbing Queries<br />

An obvious paradigm <strong>for</strong> developing an efficient dynamic EM data<br />

structure, given an existing data structure that works well when the<br />

problem fits into internal memory, is to “externalize” the internal memory<br />

data structure. If the internal memory data structure uses a binary<br />

tree, then a multiway tree such as a B-tree must be used instead. However,<br />

when searching a B-tree, it can be difficult to report all the items<br />

in the answer to the query in an output-sensitive manner. For example,<br />

in certain searching applications, each of the Θ(B) subtrees of a given<br />

node in a B-tree may contribute one item to the query answer, <strong>and</strong> as<br />

a result each subtree may need to be explored (costing several I/Os)<br />

just to report a single item of the answer.<br />

Fortunately, we can sometimes achieve output-sensitive reporting<br />

by augmenting the data structure with a set of filtering substructures,<br />

each of which is a data structure <strong>for</strong> a smaller version of the same problem.<br />

We refer to this approach, which we explain shortly in more detail,<br />

as the bootstrapping paradigm. Each substructure typically needs to<br />

store only O(B 2 ) items <strong>and</strong> to answer queries in O(log B B 2 + Z ′ /B) =<br />

O � ⌈Z ′ /B⌉ � I/Os, where Z ′ is the number of items reported. A substructure<br />

can even be static if it can be constructed in O(B) I/Os, since we<br />

can keep updates in a separate buffer <strong>and</strong> do a global rebuilding in<br />

O(B) I/Os whenever there are Θ(B) updates. Such a rebuilding costs<br />

O(1) I/Os (amortized) per update. We can often remove the amortization<br />

<strong>and</strong> make it worst-case using the weight-balanced B-trees of<br />

Section 11.2 as the underlying B-tree structure.<br />

Arge <strong>and</strong> Vitter [58] first uncovered the bootstrapping paradigm<br />

while designing an optimal dynamic EM data structure <strong>for</strong> diagonal<br />

corner two-sided 2-D queries (see Figure 12.1(a)) that meets all<br />

three design criteria listed in Chapter 12. Diagonal corner two-sided

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!