
In contrast to prefetching, which masks disk latency by overlapping an access with I/Os to other disks, caching attempts to exploit temporal locality in the accesses. A selected subset of the recently accessed blocks is held in the I/O buffer in the expectation that they will be referenced again soon, thereby avoiding repeated disk accesses for the same block. Although both prefetching and caching are well-known techniques employed ubiquitously in computer systems and networking, deploying these mechanisms effectively in a parallel I/O system raises a unique set of challenges.
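To make the caching mechanism concrete, the following Python sketch shows a minimal buffer cache with least-recently-used (LRU) replacement, one common way to exploit temporal locality. The class and the read_from_disk callback are illustrative assumptions, not part of any cited system.

    from collections import OrderedDict

    class LRUBufferCache:
        """Minimal sketch of an I/O buffer with LRU replacement."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.buffer = OrderedDict()  # block_id -> block contents

        def get_block(self, block_id, read_from_disk):
            if block_id in self.buffer:
                # Hit: reuse the buffered copy, avoiding a repeated disk
                # access, and mark it most recently used.
                self.buffer.move_to_end(block_id)
                return self.buffer[block_id]
            # Miss: evict the least recently used block if the buffer is
            # full, then fetch the requested block from disk.
            if len(self.buffer) >= self.capacity:
                self.buffer.popitem(last=False)
            data = read_from_disk(block_id)
            self.buffer[block_id] = data
            return data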

The I/O schedule determines the set of blocks that are fetched in each parallel I/O operation. The schedule is constructed dynamically so as to minimize the total number of parallel I/Os. This requires the scheduler to decide which blocks to prefetch and, when the need for replacement arises, to decide which blocks in the buffer to cache and which to evict. Prefetching and caching in parallel I/O systems are fundamentally different from their counterparts in single-disk systems, and require the use of substantially different algorithms [11,32–36]. In a single-disk system, prefetching is used to overlap I/O operations with CPU computations. This is usually done using asynchronous I/O, whereby a computation continues after making an I/O request without blocking. A stall model for analyzing the performance of overlapped I/O and computation was proposed in [17] for a single-disk system; prefetching and caching algorithms to minimize stall time as a function of CPU and I/O speeds were presented in [5,17]. Disk scheduling algorithms that reorder I/O requests to minimize disk seek times [59] can also be considered a form of prefetching in single-disk systems.
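The single-disk overlap of I/O and computation can be sketched as follows, using a background thread to model asynchronous I/O; read_block and compute are hypothetical stand-ins for the application's I/O and CPU work.

    from concurrent.futures import ThreadPoolExecutor

    def process(reference_string, read_block, compute):
        """Fetch block i+1 asynchronously while computing on block i."""
        if not reference_string:
            return
        with ThreadPoolExecutor(max_workers=1) as io:
            pending = io.submit(read_block, reference_string[0])
            for i, _ in enumerate(reference_string):
                data = pending.result()  # stall only if the I/O is still in flight
                if i + 1 < len(reference_string):
                    # Issue the next request before computing, so the disk
                    # access proceeds in parallel with the CPU work below.
                    pending = io.submit(read_block, reference_string[i + 1])
                compute(data)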

In parallel I/O systems, prefetching allows overlap between accesses on different disks, thereby hiding the I/O latency of one disk behind the access latency of another. The scheduler has to decide judiciously how much buffer space to allocate for prefetching and how much for caching, which blocks to prefetch, and which blocks to cache. For instance, to utilize the available bandwidth, it may appear desirable to keep a large number of disks busy prefetching data during an I/O; however, excessive prefetching can fill up the buffer with blocks that may not be used until much later in the computation. Such blocks have the adverse effects of choking the buffer and reducing the parallelism available for fetching more immediately needed blocks. In fact, even when the problem does not involve caching, deciding which blocks to prefetch, and when to do so, is not trivial.
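One way to see this tension is the following sketch of a single greedy parallel I/O step: given full lookahead, each disk fetches its earliest-referenced block that is not yet buffered, but only while buffer space remains. The disk_of mapping and the buffer-capacity cutoff are illustrative assumptions, and eviction is omitted for brevity.

    def plan_parallel_io(lookahead, buffer, capacity, disk_of, num_disks):
        """Choose at most one block per disk for the next parallel I/O."""
        fetch = {}                       # disk -> block to read this step
        free = capacity - len(buffer)    # buffer slots still available
        for block in lookahead:          # blocks in future reference order
            if free == 0 or len(fetch) == num_disks:
                break                    # prefetching further would choke the buffer
            disk = disk_of(block)
            if block in buffer or disk in fetch:
                continue                 # already buffered, or disk busy this step
            fetch[disk] = block
            free -= 1
        return fetch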

Another issue needs to be addressed to employ prefetching and caching effectively. In order to prefetch accurately (rather than speculatively), some knowledge of future accesses is required. This is embodied in the notion of lookahead, a measure of the extent of knowledge about future accesses that is available when making prefetching and caching decisions. Obtaining this lookahead has been an area of much active research [13,40,43,50]. In some applications, such as external sorting, the lookahead can be obtained dynamically by using a sample of the data to accurately predict the sequence of block requests [10]. In video retrieval, the sequence is determined by the playback times of blocks in the set of concurrently accessed streams; summary statistics of the streams are used to obtain the lookahead at run time [22]. Indexes in database systems can similarly be used to provide information about the actual sequence of data blocks that must be accessed. In broadcast servers, the set of requests is prioritized by the system to maximize utilization of the broadcast channel [4]; the prioritized request sequence provides the lookahead for the required I/O accesses. Access patterns can also be revealed to the system through programmer-provided hints [50], or the system may attempt to uncover sequential or strided access patterns automatically at run time [40]. Speculative execution is another technique, based on executing program code speculatively to determine the control path and the blocks accessed along it [13].
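As a small illustration of run-time pattern discovery, the following sketch detects a strided access pattern and predicts the next block to prefetch. The confirmation threshold and all names are assumptions made for the example, not details of the scheme in [40].

    class StrideDetector:
        """Predict the next block once a stride repeats often enough."""

        def __init__(self, confirm=2):
            self.last = None       # previously accessed block
            self.stride = None     # most recently observed stride
            self.hits = 0          # consecutive repeats of that stride
            self.confirm = confirm

        def observe(self, block_id):
            if self.last is not None:
                stride = block_id - self.last
                if stride == self.stride:
                    self.hits += 1
                else:
                    self.stride, self.hits = stride, 1
            self.last = block_id
            if self.stride is not None and self.hits >= self.confirm:
                return block_id + self.stride  # candidate block to prefetch
            return None                        # no confident prediction yet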

33.5 Limitations of Simple Prefetching and Caching Strategies

In [11,32], the problem of scheduling read-once reference strings, in which each block is accessed exactly once, was considered. Such reference strings are characteristic of streaming applications like multimedia retrieval. Simple, intuitive algorithms that work well in a single-disk scenario were analyzed and shown to have poor performance in the multiple-disk case. For instance, consider a natural scheduling algorithm

