09.08.2013 Views

Design and Verification of Adaptive Cache Coherence Protocols ...


memory regions simultaneously. In a program that assumes release consistency, for example,
the memory region used for input and output operations can have the semantics of sequential
consistency by employing an appropriate translation scheme for that region.

With both Cachet and CRF specified in Term Rewriting Systems, we can formally prove
that the Cachet protocol is a correct implementation of the CRF model and is free from any type
of deadlock or livelock. The verification of Cachet follows the same procedure as the verification
of the Base, WP and Migratory protocols. To prove soundness, for example, we define a
mapping function from Cachet to CRF, and show that each basic imperative rule of Cachet
can be simulated in CRF. The mapping function is based on the notion of drained terms, in
which all message queues are empty. There are many ways to drain messages from the network.
For example, we can employ backward draining for Wb_b messages, and forward draining for all
other messages. The draining rules include Rules IC8-IC22, IM3-IM5, IM10-IM13, and some
additional rules that allow Wb_b messages to be reclaimed at the cache sites from which they
were issued. Furthermore, we can downgrade all Migratory cells to WP cells to ensure that the
memory in a drained term always contains the most up-to-date data. This can be achieved by
including Rules IC5 and IC6 in the draining rules. The Imperative-&-Directive methodology,
together with the classification of basic and composite operations, has significantly simplified
the verification by reducing the number of rules that need to be considered.
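
The draining idea can be illustrated with a toy sketch. It is not the protocol's actual rule set: the message layout, the `drain` function, and the caller-supplied `deliver` handler are hypothetical names, and restoring a reclaimed writeback as a dirty cell at the issuing site is an assumption made for illustration. The sketch only shows the shape of the procedure: writeback messages are drained backward to the site that issued them, every other message is drained forward, and the loop ends when the queue is empty, which is the condition that defines a drained term.

```python
from collections import deque

def drain(queue, caches, deliver):
    """Empty a message queue: reclaim writebacks backward, deliver the rest forward.

    `deliver` is a caller-supplied stand-in for the forward draining rules.
    """
    while queue:
        msg = queue.popleft()
        if msg["kind"] == "Wb":
            # Backward draining: the issuing cache site reclaims its writeback.
            # Assumption: the cell is treated as dirty at that site again.
            caches[msg["src"]][msg["addr"]] = ("Dirty", msg["value"])
        else:
            # Forward draining: hand the message to its destination handler.
            deliver(msg, caches)
    return caches

# Usage: one writeback in flight from site A, one other message from site B.
delivered = []
caches = {"A": {}, "B": {}}
queue = deque([
    {"kind": "Wb", "src": "A", "addr": 0x10, "value": 42},
    {"kind": "Other", "src": "B", "addr": 0x20, "value": 7},
])
drain(queue, caches, lambda msg, _caches: delivered.append(msg["kind"]))
```

After the call the queue is empty, so the resulting state plays the role of a drained term on which a mapping function could be defined.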

Coarse-grain Coherence States. An implementation can maintain coherence states for
cache lines rather than individual cache cells. In modern computer systems, the size of cache
lines typically ranges from 32 to 128 bytes. The state of a cache line is a concise representation
of the states of the cells in the cache line. The Clean_w state, for example, means that all the
cache cells of the cache line are in the Clean_w state. When the cache line is modified by a
Storel instruction, the state becomes Clean_w Dirty_w^+, which implies that at least one cache cell
of the cache line is in the Dirty_w state while all other cache cells, if any, are in the Clean_w
state. Coherence actions such as cache, writeback, downgrade and upgrade operations are all
performed at the cache line granularity. This ensures that all the cache cells in a cache line
employ the same micro-protocol at any time.
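
A minimal sketch of how a line state can summarize its cell states, assuming only the clean and dirty WP cell states discussed above. The `CellState` enum, the `line_state` and `storel` functions, and the string labels are hypothetical names chosen for illustration, not part of the protocol definition.

```python
from enum import Enum

class CellState(Enum):
    CLEAN_W = "Clean_w"
    DIRTY_W = "Dirty_w"

def line_state(cells):
    """Summarize per-cell WP states as a single cache-line state."""
    if not cells:
        raise ValueError("a cache line holds at least one cell")
    if all(c is CellState.CLEAN_W for c in cells):
        return "Clean_w"
    # At least one cell is dirty; any remaining cells are clean.
    return "Clean_w Dirty_w^+"

def storel(cells, index):
    """Model a Storel to one cell: that cell becomes Dirty_w."""
    updated = list(cells)
    updated[index] = CellState.DIRTY_W
    return updated

line = [CellState.CLEAN_W] * 4
before = line_state(line)            # every cell is clean
after = line_state(storel(line, 2))  # one cell dirty, the others clean
```

Here `before` is the all-clean line state and `after` is the mixed state, so a single Storel to any cell is enough to change the state of the whole line.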

Since a cache cell can be modified without exclusive ownership, there can be multiple
writers for the same cache line simultaneously. This can happen even for data-race-free programs
because of false sharing. As a result, when a cache line in the Clean_w Dirty_w^+ state is written
back to the memory, the memory should be able to distinguish between clean and dirty cells
because only the data of dirty cells can be used to update the memory. A straightforward
solution is to maintain a modification bit for each cache cell. At writeback time, the
modification bits are also sent back so that the memory can tell which cache cells have been
modified. By allowing a cache line to be modified without exclusive ownership, the Cachet
protocol not only reduces the latency of write operations, but also alleviates potential cache
thrashing due to false sharing.
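
The role of the modification bits can be sketched as follows. This is an illustrative model only: the `write_back` function, the list-based line representation, and the sample values are assumptions, not the implementation described in the text. Two sites write different cells of the same line (false sharing), and the memory merges each writeback by taking only the cells whose modification bit is set.

```python
def write_back(memory_line, cache_line, modified_bits):
    """Merge a written-back line into memory, taking only modified cells."""
    if not (len(memory_line) == len(cache_line) == len(modified_bits)):
        raise ValueError("line and bit-vector lengths must match")
    return [
        cached if modified else current
        for current, cached, modified in zip(memory_line, cache_line, modified_bits)
    ]

memory = [0, 0, 0, 0]

# Site A modified cell 0 and site B modified cell 3 of the same line.
line_a, bits_a = [7, 0, 0, 0], [True, False, False, False]
line_b, bits_b = [0, 0, 0, 9], [False, False, False, True]

memory = write_back(memory, line_a, bits_a)
memory = write_back(memory, line_b, bits_b)
```

Without the bits, the second writeback would overwrite cell 0 with site B's stale copy. With them, both updates survive regardless of the order in which the two writebacks arrive.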

