17.01.2013 Views

Algorithms and Data Structures for External Memory


External Sorting and Related Problems

[Figure 5.4: diagram with D = 6 disks (numbered 1 to 6) and internal memory holding the prefetch buffers / output buffers, with up to m prefetched or queued blocks. The stream of blocks is read in Σ order and written in Σ^R order. Each output step in the greedy write-once schedule corresponds to a prefetching step in the lazy read-once schedule.]

Fig. 5.4 Duality between prefetch scheduling and output scheduling. The prefetch scheduling problem for sequence Σ proceeds from bottom to top. Blocks are input from the disks, stored in the prefetch buffers, and ultimately read by the application program in the order specified by the sequence Σ. The output scheduling problem for the reverse sequence Σ^R proceeds from top to bottom. Blocks are written by the application program in the order specified by Σ^R, queued in the output buffers, and ultimately output to the disks. The hatched blocks illustrate how the blocks of disk 2 might be distributed.

5.3.1 Greedy Read-Once Scheduling

Before we discuss an optimum prefetching algorithm for read-once scheduling, we shall first look at the following natural approach adopted by SRM [68, 72] in Section 5.2.1, which unfortunately does not achieve the optimum schedule length. It uses a greedy approach: Suppose that blocks b_1, b_2, ..., b_i of the sequence Σ have already been read in prior steps and are thus removed from the prefetch buffers. The current step consists of reading the next blocks of Σ that are already in the prefetch buffers. That is, suppose blocks b_{i+1}, b_{i+2}, ..., b_j are in the prefetch buffers, but b_{j+1} is still on a disk. Then blocks b_{i+1}, b_{i+2}, ..., b_j are read and removed from the prefetch buffers.
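The first part of a greedy step, as described above, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `consume_step`, the representation of Σ as a list of block identifiers, and the representation of the prefetch buffers as a set are all assumptions made here, not notation from the text. It relies on the read-once property that each block appears at most once in Σ.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def consume_step(
    sigma: Sequence[Hashable], buffered: Set[Hashable], i: int
) -> Tuple[int, List[Hashable]]:
    """Read the longest run b_{i+1}, ..., b_j of sigma that is already buffered.

    `i` is the number of blocks already read, so sigma[i] is b_{i+1} in the
    1-based notation of the text. The set `buffered` (hypothetical stand-in
    for the prefetch buffers) is updated in place. Returns the new count j
    together with the blocks read in this step.
    """
    j = i
    # Advance while the next block of sigma is resident in the buffers;
    # stop at the first block that is still on a disk.
    while j < len(sigma) and sigma[j] in buffered:
        buffered.remove(sigma[j])  # a block leaves the buffers once read
        j += 1
    return j, list(sigma[i:j])


# Example: b3 is still on disk, so only b1 and b2 can be read in this step.
sigma = ["b1", "b2", "b3", "b4", "b5"]
buffered = {"b1", "b2", "b4"}
j, blocks_read = consume_step(sigma, buffered, 0)
```

In this example the step stops at b3, leaving b4 in the buffers until b3 has been input from its disk. The sketch covers only the reading half of a step; the disk-input half is not modeled here.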

The second part of the current step involves input from the disks. For each of the D disks, consider its highest priority block not yet input,
