15.01.2013 Views

U. Glaeser


    increment lowestPriorityOnDisk(disk(bi))
    increment blocksWithPriority(priority(bi))
    increment numberOfBlocksPlaced
    if (numberOfBlocksPlaced = M) then
        decrement numberOfBlocksPlaced by blocksWithPriority(lowestPriority)
        increment lowestPriority

By using the priority assignment described here, it has been shown that L-OPT always creates a schedule whose length is within a factor of Θ(√(MD/L)) of the length of the schedule created by the optimal offline algorithm, and that this is the best possible ratio. In addition, L-OPT’s schedule is never more than twice the length of that created by any online algorithm (including algorithms that consistently make fortuitously correct guesses) that has the same amount of lookahead. Finally, note that if the entire reference string is known in advance, then L-OPT is the optimal offline algorithm [32].
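As a rough numerical illustration of how the ratio above shrinks as the lookahead grows, the following sketch evaluates only the term inside the Θ(·). It assumes M is the buffer size in blocks, D the number of disks, and L the lookahead length; the function name is hypothetical, and the constant factors hidden by Θ are not modeled.

```python
import math

def lookahead_ratio_term(M: int, D: int, L: int) -> float:
    """Return sqrt(M*D/L), the term inside the Theta(.) bound.

    Assumed meanings: M = buffer size in blocks, D = number of disks,
    L = lookahead length. Hidden constants are ignored.
    """
    return math.sqrt(M * D / L)

# With a lookahead of one buffer-load (L = M) the term reduces to sqrt(D).
print(lookahead_ratio_term(64, 4, 64))   # 2.0
# With L = M * D the term drops to 1.
print(lookahead_ratio_term(64, 4, 256))  # 1.0
```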

33.7 Optimal Parallel-Disk Caching

For general reference strings, in which blocks may be accessed repeatedly, the buffer manager must decide which blocks to cache and which to evict. As noted earlier, the optimal single-disk caching policy embodied in the MIN algorithm can be decidedly suboptimal in the parallel I/O case. Prefetching and caching need to cooperate harmoniously in the multiple-disk situation. The caching problem has been studied by several researchers in the recent past for different I/O organizations. For a distributed-buffer configuration, where each disk has its own private buffer, an algorithm P-MIN that generalizes MIN to multiple disks was shown to be optimal [57]. P-MIN uses the furthest-forward-reference policy on each disk independently to determine the eviction candidate for that disk. It initiates an I/O only on demand; in the ensuing I/O operation it prefetches aggressively from every disk unless the next reference to the block to be prefetched is further away than the references of all blocks currently in that disk’s buffer. For a shared-buffer configuration in the stall model of computation, a sophisticated near-optimal algorithm called Reverse-Aggressive, which aims to minimize the stall time, was proposed and analyzed in [36].
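The per-disk eviction and prefetch rules described for P-MIN can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm as specified in [57]: the reference string is a plain list of block names, each disk's private buffer is a set, and the function names and the tie-breaking rule are hypothetical.

```python
from typing import Dict, List, Optional, Set

def next_reference(block: str, refs: List[str], now: int) -> float:
    """Position of the next reference to `block` at or after `now` (inf if none)."""
    for i in range(now, len(refs)):
        if refs[i] == block:
            return i
    return float("inf")

def eviction_candidates(buffers: Dict[int, Set[str]], refs: List[str],
                        now: int) -> Dict[int, Optional[str]]:
    """For each disk's private buffer, pick the block referenced furthest ahead."""
    result: Dict[int, Optional[str]] = {}
    for disk, blocks in buffers.items():
        if not blocks:
            result[disk] = None
            continue
        # Assumption: ties are broken by block name to keep the sketch deterministic.
        result[disk] = max(sorted(blocks),
                           key=lambda b: next_reference(b, refs, now))
    return result

def should_prefetch(candidate: str, disk_buffer: Set[str],
                    refs: List[str], now: int) -> bool:
    """Prefetch unless the candidate is referenced later than every buffered block."""
    if not disk_buffer:
        return True
    furthest = max(next_reference(b, refs, now) for b in disk_buffer)
    return next_reference(candidate, refs, now) < furthest

refs = ["a1", "b1", "a2", "b2", "a1", "b3"]
buffers = {0: {"a1", "a2"}, 1: {"b1", "b2", "b3"}}
print(eviction_candidates(buffers, refs, now=1))  # {0: 'a1', 1: 'b3'}
print(should_prefetch("a2", {"a1"}, refs, now=1))  # True
print(should_prefetch("a1", {"a2"}, refs, now=1))  # False
```

Each disk is handled independently here, which mirrors the distributed-buffer setting: a block on one disk never competes for space with a block on another.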

Recently, an optimal prefetching and caching algorithm, SUPERVISOR, for the parallel disk model was presented in [34]. Like the L-OPT algorithm for prefetching, SUPERVISOR uses the general framework of priority-controlled greedy I/O. The scheme for assigning priorities to references is, however, considerably more complex than that used by L-OPT for read-once reference strings. Just as a low priority with respect to prefetching indicates that an I/O for that block can be delayed, a low priority with respect to caching indicates that the block can be evicted from the buffer.
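To make the dual role of priorities concrete, here is one possible reading of a single priority-controlled greedy I/O step, written as a sketch. It is an illustration only and not the scheme of [34]: the priorities are taken as given, the buffer is a shared set of at most `capacity` blocks, and all names (`greedy_io_step`, `pending`, `priority`) are hypothetical.

```python
from typing import Dict, List, Set

def greedy_io_step(buffer: Set[str], pending: Dict[int, List[str]],
                   priority: Dict[str, int], capacity: int) -> List[str]:
    """One parallel I/O step: every disk tries to fetch its highest-priority block.

    A fetch goes ahead if the shared buffer has a free slot, or if the
    lowest-priority cached block has lower priority than the candidate
    (that block is then evicted). Otherwise the fetch is delayed.
    """
    fetched: List[str] = []
    for disk in sorted(pending):
        queue = pending[disk]
        if not queue:
            continue
        candidate = max(queue, key=lambda b: priority[b])
        if len(buffer) >= capacity:
            # Assumption: blocks fetched in this same step are not evicted.
            victims = [b for b in sorted(buffer) if b not in fetched]
            if not victims:
                continue
            victim = min(victims, key=lambda b: priority[b])
            if priority[victim] >= priority[candidate]:
                continue  # low-priority prefetch is delayed
            buffer.remove(victim)  # low-priority cached block is evicted
        queue.remove(candidate)
        buffer.add(candidate)
        fetched.append(candidate)
    return fetched

priority = {"a1": 5, "a2": 1, "b1": 3, "b2": 4}
buffer = {"a2"}
pending = {0: ["a1"], 1: ["b1", "b2"]}
print(greedy_io_step(buffer, pending, priority, capacity=2))  # ['a1', 'b2']
print(sorted(buffer))                                         # ['a1', 'b2']
```

In the example, disk 0 fills the free slot with `a1`, and disk 1 then evicts the low-priority `a2` to make room for `b2`, while `b1` stays pending.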

Intuitively, SUPERVISOR assigns priorities in accordance with two principles: issue prefetches for blocks close to their reference so that they do not wastefully occupy buffer space, and avoid caching a block if a later free I/O slot is available that can be used to fetch it. Among possible candidates for a block to cache, it is desirable to cache a block that will occupy the buffer for a smaller duration. Hence, the question to be answered is: Given that at some time we would like two previously referenced blocks in the buffer, which of these should have been cached and which should be fetched now? It is preferable to cache the block whose previous reference is closer to the current time, as this reduces the buffer pressure between the two previous accesses. SUPERVISOR uses this intuition to assign priorities to blocks for prefetching and caching.
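The cache-or-fetch intuition above can be illustrated with a small sketch. It assumes time is measured in reference positions and that a cached block occupies a buffer slot from its previous reference until the current time; the function name and data layout are hypothetical, and this is not the priority assignment of [34].

```python
from typing import Dict, Tuple

def choose_block_to_cache(prev_ref: Dict[str, int], now: int) -> Tuple[str, str]:
    """Given two blocks wanted in the buffer at time `now`, return (cache, fetch).

    Caching a block costs one buffer slot for (now - previous reference)
    time units, so the block with the more recent previous reference is
    the cheaper one to have kept in the buffer.
    """
    if len(prev_ref) != 2:
        raise ValueError("expected exactly two candidate blocks")
    (x, tx), (y, ty) = sorted(prev_ref.items())
    occupancy_x, occupancy_y = now - tx, now - ty
    return (x, y) if occupancy_x <= occupancy_y else (y, x)

# Block x was last referenced at time 3, block y at time 8; both are needed at time 10.
print(choose_block_to_cache({"x": 3, "y": 8}, now=10))  # ('y', 'x')
```

Here caching `y` ties up a slot for 2 time units, whereas caching `x` would tie one up for 7, so `y` is cached and `x` is fetched again.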

The formal details of the priority assignment algorithm used by SUPERVISOR are presented in [34]. The routine examines subsets of the lookahead consisting of M distinct references and then assigns priorities to one block from each disk. The idea behind the assignment can be understood by considering the largest subsequence of the lookahead that includes the last reference and has at most M distinct references. All blocks that are assigned the smallest priority should belong to this set. Otherwise there will be some reference such that M or more blocks referenced after it have a higher or equal priority. Which among these blocks should have the lowest priority? The lowest priority can be assigned to at most one distinct reference from each disk. Additionally, among two blocks from the same disk, this priority is assigned to the block whose previous reference outside this subsequence is earlier, because we would rather not

© 2002 by CRC Press LLC
