Algorithms and Data Structures for External Memory
Algorithms and Data Structures for External Memory
Algorithms and Data Structures for External Memory
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
12.3 Bootstrapping <strong>for</strong> 2-D Diagonal Corner<strong>and</strong> Stabbing Queries 107<br />
measured. Query per<strong>for</strong>mance was then tested as be<strong>for</strong>e. The results<br />
in Table 12.1 indicate that the buffer R*-tree <strong>and</strong> the Hilbert R-tree<br />
achieve a similar degree of packing, but the buffer R*-tree provides<br />
better update <strong>and</strong> query per<strong>for</strong>mance.<br />
12.3 Bootstrapping <strong>for</strong> 2-D Diagonal Corner<br />
<strong>and</strong> Stabbing Queries<br />
An obvious paradigm <strong>for</strong> developing an efficient dynamic EM data<br />
structure, given an existing data structure that works well when the<br />
problem fits into internal memory, is to “externalize” the internal memory<br />
data structure. If the internal memory data structure uses a binary<br />
tree, then a multiway tree such as a B-tree must be used instead. However,<br />
when searching a B-tree, it can be difficult to report all the items<br />
in the answer to the query in an output-sensitive manner. For example,<br />
in certain searching applications, each of the Θ(B) subtrees of a given<br />
node in a B-tree may contribute one item to the query answer, <strong>and</strong> as<br />
a result each subtree may need to be explored (costing several I/Os)<br />
just to report a single item of the answer.<br />
Fortunately, we can sometimes achieve output-sensitive reporting<br />
by augmenting the data structure with a set of filtering substructures,<br />
each of which is a data structure <strong>for</strong> a smaller version of the same problem.<br />
We refer to this approach, which we explain shortly in more detail,<br />
as the bootstrapping paradigm. Each substructure typically needs to<br />
store only O(B 2 ) items <strong>and</strong> to answer queries in O(log B B 2 + Z ′ /B) =<br />
O � ⌈Z ′ /B⌉ � I/Os, where Z ′ is the number of items reported. A substructure<br />
can even be static if it can be constructed in O(B) I/Os, since we<br />
can keep updates in a separate buffer <strong>and</strong> do a global rebuilding in<br />
O(B) I/Os whenever there are Θ(B) updates. Such a rebuilding costs<br />
O(1) I/Os (amortized) per update. We can often remove the amortization<br />
<strong>and</strong> make it worst-case using the weight-balanced B-trees of<br />
Section 11.2 as the underlying B-tree structure.<br />
Arge <strong>and</strong> Vitter [58] first uncovered the bootstrapping paradigm<br />
while designing an optimal dynamic EM data structure <strong>for</strong> diagonal<br />
corner two-sided 2-D queries (see Figure 12.1(a)) that meets all<br />
three design criteria listed in Chapter 12. Diagonal corner two-sided