
The ERF algorithm was analyzed in [11]. It was shown that there exist reference strings for which ERF will perform Θ(√D) times as many I/Os as the optimal schedule. For the average case, under the same assumptions as for aggressive prefetching, it can be shown that ERF can read an N-block reference string in Θ(N/D) I/Os using a buffer of size Ω(D log D) blocks [10]. Hence, although ERF improves upon aggressive prefetching, it does not construct the optimal-length schedule.

In the previous discussion all blocks were implicitly assumed to be distinct. Such reference strings are called read-once and are characteristic of streaming applications like multimedia retrieval. General reference strings, where each block can be accessed repeatedly, introduce additional issues related to caching. In particular, decisions need to be made regarding which blocks to evict from the buffer. In a single-disk system the optimal offline caching strategy is the MIN algorithm [12], which always evicts the block whose next reference is furthest in the future; however, it is easy to show that using this policy in a multiple-disk situation does not necessarily minimize the total number of parallel I/Os that are required. In fact, there exist reference strings for which the use of the MIN policy necessitates Θ(D) times as many I/Os as an optimal caching strategy [34].
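The MIN eviction rule described above can be sketched in a few lines. This is an illustrative helper, not code from the text; the function and variable names are chosen here for clarity.

```python
def min_evict(buffer_blocks, future_refs):
    """Belady's MIN policy: evict the buffered block whose next
    reference lies furthest in the future (a block that is never
    referenced again is the best candidate of all)."""
    def next_use(b):
        try:
            return future_refs.index(b)   # position of b's next reference
        except ValueError:
            return float("inf")           # never referenced again
    return max(buffer_blocks, key=next_use)

# Buffer holds a, b, c; upcoming references use a and c before b,
# so MIN evicts b.
assert min_evict(["a", "b", "c"], ["a", "c", "a", "b"]) == "b"
```

As the text notes, this rule is optimal only for a single disk; on D disks it can cost Θ(D) times the optimal number of parallel I/Os.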

33.6 Optimal Parallel-Disk Prefetching<br />

In this section we present an online prefetching algorithm, L-OPT, for read-once reference strings. L-OPT uses L-block lookahead: at any instant L-OPT knows the next L references, and it uses this lookahead to determine which blocks to fetch in the next I/O. It uses a priority assignment scheme to determine the currently most useful blocks to fetch and to retain in the buffer. As the lookahead window advances and information about further requests is made available, the priorities of blocks are dynamically updated to incorporate the latest information. When considered as an offline algorithm for which the entire reference string is known in advance, L-OPT has been shown to be the optimal prefetching algorithm that minimizes the number of parallel I/Os [32].

L-OPT is a priority-controlled greedy prefetching algorithm. A priority-controlled greedy prefetching scheme provides a general framework for describing different prefetching algorithms. Blocks in the lookahead are assigned priorities depending on the scheduling policy in effect. In every I/O the scheduler fetches one block each from as many disks as possible, while ensuring that the buffer never retains a lower-priority block in preference to fetching one with a higher priority, if necessary by evicting the lower-priority blocks. Algorithm priority-controlled greedy I/O describes the algorithm formally using the definitions below.

Different prefetching policies can be implemented in this framework merely by changing the priority function. For instance, to implement the ERF prefetching algorithm the priority of blocks should decrease with their position in the reference string; this is easily achieved if the priority function assigns the ith block in the reference string a priority equal to −i. Similarly, prefetching strategies akin to aggressive prefetching can be emulated by assigning the ith referenced block from each disk a priority of +∞ if it is the demand block and −i otherwise.
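The two priority functions just described can be written down directly. This is a sketch under the stated rules; the function names are illustrative, not from the text.

```python
INF = float("inf")

def erf_priority(i):
    """ERF: priority decreases with position in the reference string,
    so the ith block simply gets priority -i."""
    return -i

def aggressive_priority(i, is_demand_block):
    """Aggressive-style prefetching: the demand block gets +infinity,
    and the ith referenced block from its disk gets -i otherwise."""
    return INF if is_demand_block else -i

# The earliest block always outranks later ones under ERF...
assert erf_priority(1) > erf_priority(5)
# ...while a demand block outranks everything under aggressive prefetching.
assert aggressive_priority(5, True) > aggressive_priority(1, False)
```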

Definitions

1. Let Σ = b1, b2, …, bN denote the reference string. If bi is a block in the lookahead, let disk(bi) denote the disk from which it needs to be fetched and let priority(bi) be the block's priority.
2. At the instant when bi is referenced, let Bi denote the set of blocks in the lookahead that are present in the buffer.
3. When bi is referenced, let Hi be the maximal set of (up to) D blocks such that if b ∈ Hi then the priority of b is the largest among all blocks from disk(b) in the lookahead but not present in the buffer.
4. Let Bi+ be the maximal set of (up to) M blocks with the highest priorities in Hi ∪ Bi; in the case of ties the block occurring earlier in Σ is preferred.
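One step of the priority-controlled greedy scheme can be sketched from these definitions: build Hi (the best unbuffered block per disk), rank Hi ∪ Bi to get Bi+, fetch the members of Hi that survive, and evict the buffered blocks that do not. This is an illustrative sketch, with names chosen here (`greedy_io_step`, `in_buffer`), not the formal algorithm from the text.

```python
def greedy_io_step(lookahead, priority, in_buffer, M):
    """One parallel I/O of priority-controlled greedy prefetching.

    lookahead: list of (block, disk) pairs in reference order (Sigma)
    priority:  dict mapping block -> priority
    in_buffer: set of buffered blocks (B_i)
    M:         buffer capacity in blocks
    Returns (fetch, evict): at most one block per disk to read,
    and the lower-priority buffered blocks displaced to make room.
    """
    pos = {b: k for k, (b, _) in enumerate(lookahead)}  # position in Sigma

    # H_i: per disk, the highest-priority lookahead block not in the buffer
    best = {}
    for b, d in lookahead:
        if b not in in_buffer and (d not in best or priority[b] > priority[best[d]]):
            best[d] = b
    H = set(best.values())

    # B_i+: up to M highest-priority blocks in H ∪ B_i,
    # ties broken in favor of the block occurring earlier in Sigma
    ranked = sorted(H | in_buffer, key=lambda b: (-priority[b], pos[b]))
    B_plus = set(ranked[:M])

    fetch = H & B_plus           # one block from each chosen disk
    evict = in_buffer - B_plus   # buffered blocks that lost their place
    return fetch, evict

# Example with ERF-style priorities (earlier reference -> higher priority):
# blocks a, c live on disk 0, b on disk 1, and c is already buffered.
lookahead = [("a", 0), ("b", 1), ("c", 0)]
priority = {"a": 0, "b": -1, "c": -2}
fetch, evict = greedy_io_step(lookahead, priority, in_buffer={"c"}, M=2)
# fetch == {"a", "b"} (one block from each disk); evict == {"c"}
```

Because H contains at most one block per disk by construction, the fetch set respects the one-I/O-per-disk constraint, and evicting exactly `in_buffer - B_plus` enforces the rule that no lower-priority block is retained at the expense of fetching a higher-priority one.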

© 2002 by CRC Press LLC
