09.08.2013 Views

Design and Verification of Adaptive Cache Coherence Protocols ...


CRF model has both upward and downward compatibility, that is, the ability to run existing programs correctly and efficiently on a CRF machine, and the ability to run CRF programs well on existing machines.

The CRF model was motivated by a directory-based cache coherence protocol that was originally designed for the MIT StarT-Voyager multiprocessor system [11, 12]. Verifying the protocol required a memory model that specifies the legal memory behaviors that the protocol was supposed to implement. Ironically, we did not have such a specification even after the protocol design was completed. As a result, it was not clear what invariants must be proved in order to ensure that the cache coherence protocol always exhibits the same behavior as sequential consistency for properly synchronized programs. The lack of a precise specification also made it difficult to understand and reason about the system behavior for programs that may have data races. To deal with this dilemma, we transformed the protocol by eliminating various directive operations and implementation details such as message queues, preserving the semantics throughout the transformation. This eventually led to CRF, an abstract protocol that cannot be further simplified. The CRF model can be implemented efficiently for shared memory systems, largely because the model itself is an abstraction of a highly optimized cache coherence protocol.

We have designed a set of adaptive cache coherence protocols which are optimized for some common access patterns. The Base protocol is the most straightforward implementation of CRF, and is ideal for programs in which only necessary commit and reconcile operations are performed. The WP protocol allows a reconcile operation on a clean cache cell to complete without purging the cell so that the data can be accessed by subsequent load operations without causing a cache miss; this is intended for programs that contain excessive reconcile operations. The Migratory protocol allows an address to be cached in at most one cache; it fits well when one processor is likely to access an address many times before another processor accesses the same address.
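The difference between Base and WP on a reconcile can be illustrated with a toy model. The sketch below is not the protocol definition: the class `ToyCache`, its method names, and the `protocol` flag are hypothetical, it models a single cache over a dictionary standing in for memory, and it omits whatever machinery the real WP protocol uses to keep a retained clean copy coherent. It only shows the effect described above: under Base a reconcile on a clean cell purges it, so the next load misses, whereas under WP the cell is retained.

```python
from enum import Enum


class State(Enum):
    CLEAN = "clean"
    DIRTY = "dirty"


class ToyCache:
    """Toy single-cache model (hypothetical names, not the thesis's rules)."""

    def __init__(self, memory, protocol):
        if protocol not in ("base", "wp"):
            raise ValueError("protocol must be 'base' or 'wp'")
        self.memory = memory      # dict standing in for shared memory
        self.protocol = protocol
        self.cells = {}           # address -> (value, State)
        self.misses = 0

    def load(self, addr):
        # A load on an uncached address is a miss and fetches a clean copy.
        if addr not in self.cells:
            self.misses += 1
            self.cells[addr] = (self.memory[addr], State.CLEAN)
        return self.cells[addr][0]

    def store(self, addr, value):
        # A store updates the local cell only and leaves it dirty.
        if addr not in self.cells:
            self.misses += 1
        self.cells[addr] = (value, State.DIRTY)

    def commit(self, addr):
        # Commit: a dirty cell is written back to memory and becomes clean.
        cell = self.cells.get(addr)
        if cell is not None and cell[1] is State.DIRTY:
            self.memory[addr] = cell[0]
            self.cells[addr] = (cell[0], State.CLEAN)

    def reconcile(self, addr):
        # Base: a clean cell is purged so the next load refetches.
        # WP (as modeled here): the clean cell is simply kept.
        cell = self.cells.get(addr)
        if cell is not None and cell[1] is State.CLEAN and self.protocol == "base":
            del self.cells[addr]


def misses_with_repeated_reconcile(protocol, rounds=3):
    """Count misses for a loop that reconciles before every load."""
    cache = ToyCache({"x": 1}, protocol)
    for _ in range(rounds):
        cache.reconcile("x")
        cache.load("x")
    return cache.misses
```

In this toy, a loop that reconciles before every load misses on every iteration under `"base"` and only on the first iteration under `"wp"`, which is the situation the text calls "excessive reconcile operations".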

We further developed an adaptive cache coherence protocol called Cachet that provides a wide scope of adaptivity for DSM systems. The Cachet protocol is a seamless integration of multiple micro-protocols, and embodies both intra-protocol and inter-protocol adaptivity to achieve high performance under changing memory access patterns. A cache cell can be modified without the so-called exclusive ownership, which effectively allows multiple writers for the same memory location simultaneously. This can reduce the average latency for write operations and alleviate potential cache thrashing due to false sharing. Moreover, the purge of an invalidated cache cell can be deferred to the next reconcile point, which can help reduce cache thrashing due to read-write false sharing. An early version of Cachet with only the Base and WP micro-protocols can be found elsewhere [113]. Since Cachet implements the CRF model, it is automatically a protocol that implements the memory models whose programs can be translated into CRF programs.
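The effect of deferring the purge of an invalidated cell can be sketched with a small counting experiment. This is an illustration under stated assumptions, not Cachet itself: `DeferringCache`, its `defer_purge` flag, and the `invalidate` call are hypothetical names, memory is a plain dictionary, and a remote writer is simulated by updating that dictionary directly. With eager purging every invalidation costs the reader a miss; with deferral the reader keeps using its (possibly stale) copy until its next reconcile.

```python
class DeferringCache:
    """Toy reader-side cache (hypothetical names, not Cachet's rules)."""

    def __init__(self, memory, defer_purge):
        self.memory = memory            # dict standing in for shared memory
        self.defer_purge = defer_purge
        self.cells = {}                 # address -> [value, invalidated flag]
        self.misses = 0

    def load(self, addr):
        # A load on an uncached address is a miss and fetches from memory.
        if addr not in self.cells:
            self.misses += 1
            self.cells[addr] = [self.memory[addr], False]
        return self.cells[addr][0]

    def invalidate(self, addr):
        # Eager: drop the cell now. Deferred: only mark it for later purge.
        cell = self.cells.get(addr)
        if cell is None:
            return
        if self.defer_purge:
            cell[1] = True
        else:
            del self.cells[addr]

    def reconcile(self, addr):
        # The deferred purge happens here, at the next reconcile point.
        cell = self.cells.get(addr)
        if cell is not None and cell[1]:
            del self.cells[addr]


def count_misses(defer_purge):
    """Return (misses, value seen after the final reconcile)."""
    memory = {"x": 0}
    cache = DeferringCache(memory, defer_purge)
    cache.load("x")
    for new_value in (1, 2, 3):
        memory["x"] = new_value   # simulated write by another processor
        cache.invalidate("x")
        cache.load("x")           # reader touches the address again
    cache.reconcile("x")
    final_value = cache.load("x")
    return cache.misses, final_value
```

In this toy run the eager variant takes four misses and the deferred variant two, and both observe the latest value once the reconcile has been performed.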

Our view of adaptive protocols contains three layers: mandatory rules, voluntary rules
