09.08.2013 Views

Design and Verification of Adaptive Cache Coherence Protocols ...


that are interconnected with an off-the-shelf network [41, 126]. Such systems can effectively couple hardware cache coherence with software DSM systems. Shared memory systems can also be implemented via compilers that convert shared memory accesses into synchronization and coherence primitives [105, 107].

1.2.1 Snoopy Protocols and Directory-based Protocols

There are two types of cache coherence protocols: snoopy protocols for bus-based systems and directory-based protocols for DSM systems. In bus-based multiprocessor systems, since an ongoing bus transaction can be observed by all the processors, appropriate coherence actions can be taken when an operation threatening coherence is detected. Protocols that fall into this category are called snoopy protocols because each cache snoops bus transactions to watch memory transactions of other processors. Various snoopy protocols have been proposed [13, 38, 52, 53, 67, 96]. When a processor reads an address not in its cache, it broadcasts a read request on the snoopy bus. Memory or the cache that has the most up-to-date copy will then supply the data. When a processor broadcasts its intention to write an address that it does not own exclusively, other caches need to invalidate or update their copies.
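The read and write behavior described above can be illustrated with a minimal sketch of a write-invalidate snoopy bus. The names `Bus` and `SnoopyCache`, the two-state labels "S" (shared) and "M" (modified), and the word-granularity simplification are assumptions made for illustration; they do not reproduce any particular protocol from the cited literature, and the update variant is not shown.

```python
class Bus:
    """Hypothetical shared bus: every attached cache observes each request."""

    def __init__(self, memory):
        self.memory = memory      # addr -> value, the backing store
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read_request(self, requester, addr):
        # All other caches snoop the read; a cache holding a modified copy
        # supplies the data, which is also written back to memory.
        for cache in self.caches:
            if cache is not requester:
                data = cache.snoop_read(addr)
                if data is not None:
                    self.memory[addr] = data
                    return data
        # Otherwise memory holds the most up-to-date copy.
        return self.memory.get(addr, 0)

    def write_intent(self, requester, addr):
        # All other caches snoop the write intent and invalidate their copies.
        for cache in self.caches:
            if cache is not requester:
                cache.snoop_write(addr)


class SnoopyCache:
    def __init__(self, bus):
        self.lines = {}           # addr -> [state, value], state is "S" or "M"
        self.bus = bus
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:                 # read miss: broadcast
            value = self.bus.read_request(self, addr)
            self.lines[addr] = ["S", value]
        return self.lines[addr][1]

    def write(self, addr, value):
        line = self.lines.get(addr)
        if line is None or line[0] != "M":         # not owned exclusively
            self.bus.write_intent(self, addr)
        self.lines[addr] = ["M", value]

    def snoop_read(self, addr):
        line = self.lines.get(addr)
        if line is not None and line[0] == "M":
            line[0] = "S"                          # downgrade, supply data
            return line[1]
        return None

    def snoop_write(self, addr):
        self.lines.pop(addr, None)                 # invalidate stale copy
```

In this sketch a write by one cache removes the line from every other cache, so a later read by another processor misses and is served by the writer's modified copy rather than by stale memory.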

Unlike snoopy protocols, directory-based protocols do not rely upon the broadcast mechanism to invalidate or update stale copies. They maintain a directory entry for each memory block to record the cache sites in which the memory block is currently cached. The directory entry is often maintained at the site in which the corresponding physical memory resides. Since the locations of shared copies are known, the protocol engine at each site can maintain coherence by employing point-to-point protocol messages. The elimination of broadcast overcomes a major limitation on scaling cache coherent machines to large-scale multiprocessor systems. A directory-based cache coherence protocol can be implemented with various directory structures [6, 25, 26]. The full-map directory structure [77] maintains a complete record of which caches are sharing the memory block. In a straightforward implementation, each directory entry contains one bit per cache site indicating whether that cache has a shared copy. Its main drawback is that the directory storage can become prohibitively large for large-scale systems, since every entry grows with the number of cache sites. Alternative directory structures have been proposed to overcome this problem [27, 115, 118]. Different directory structures represent different implementation tradeoffs between performance and implementation complexity and cost [89, 94, 114].
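A minimal sketch of the full-map structure, assuming one bit per cache site in each directory entry as described above. The class `FullMapDirectory`, its method names, and the helper `directory_bits` are hypothetical and serve only to show how a known sharer set replaces broadcast and why storage grows with system size.

```python
class FullMapDirectory:
    """Hypothetical full-map directory: one presence bit per cache site."""

    def __init__(self, num_sites):
        self.num_sites = num_sites
        self.sharers = {}         # block address -> integer used as bit vector

    def _check(self, site):
        if not 0 <= site < self.num_sites:
            raise ValueError(f"unknown cache site {site}")

    def record_read(self, block, site):
        # Set the presence bit of the site that now caches the block.
        self._check(site)
        self.sharers[block] = self.sharers.get(block, 0) | (1 << site)

    def sharer_list(self, block):
        bits = self.sharers.get(block, 0)
        return [s for s in range(self.num_sites) if (bits >> s) & 1]

    def record_write(self, block, writer):
        # Only the recorded sharers need a point-to-point invalidation;
        # afterwards the writer is the sole holder of the block.
        self._check(writer)
        targets = [s for s in self.sharer_list(block) if s != writer]
        self.sharers[block] = 1 << writer
        return targets


def directory_bits(num_sites, num_blocks):
    # Presence-bit storage only: one bit per site for every memory block.
    return num_sites * num_blocks
```

For example, under these assumptions 64 cache sites and 1024 memory blocks already need 64 * 1024 presence bits, and doubling the number of sites doubles the size of every entry.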

A cache coherence protocol always implements some memory model that defines the semantics for memory access operations. Most snoopy protocols ensure sequential consistency, provided that memory accesses are performed in order. More sophisticated cache coherence protocols can implement relaxed memory models to improve performance.
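As an illustration of what sequential consistency permits, the sketch below enumerates every interleaving of a two-thread example that preserves each thread's program order. The example program (a standard store-buffering test) and the function name `sc_outcomes` are assumptions chosen for illustration and are not taken from the text.

```python
from itertools import combinations


def sc_outcomes():
    # Thread 0: x = 1; r0 = y        Thread 1: y = 1; r1 = x
    t0 = [("store", "x", 1), ("load", "y", "r0")]
    t1 = [("store", "y", 1), ("load", "x", "r1")]
    total = len(t0) + len(t1)
    outcomes = set()
    # Choosing which global steps belong to thread 0 fixes one interleaving
    # in which both threads still execute their operations in program order.
    for slots in combinations(range(total), len(t0)):
        memory = {"x": 0, "y": 0}
        regs = {}
        i0 = i1 = 0
        for step in range(total):
            if step in slots:
                kind, addr, arg = t0[i0]
                i0 += 1
            else:
                kind, addr, arg = t1[i1]
                i1 += 1
            if kind == "store":
                memory[addr] = arg
            else:
                regs[arg] = memory[addr]
        outcomes.add((regs["r0"], regs["r1"]))
    return outcomes
```

Under these assumptions the outcome in which both loads return 0 never appears in the result set. A relaxed memory model may allow that outcome, which is one way such models trade stronger guarantees for performance.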

1.2.2 Adaptive Cache Coherence Protocols

Shared memory programs have various access patterns [116, 124]. Empirical evidence suggests that no fixed cache coherence protocol works well for all access patterns [15, 38, 39, 42, 122]. For
