09.08.2013 Views

Design and Verification of Adaptive Cache Coherence Protocols ...


serves values that are consistent with some serial execution order of the dag, but different
threads can observe different serial orders. Thus, store operations performed by a thread can
be observed by its successors, but threads that are incomparable in the dag may or may not
observe each other's write operations. Dag consistency allows load operations to retrieve values
that are based on different sequential executions, provided that data dependencies of the dag
are respected.

In this thesis, we propose the Commit-Reconcile & Fences (CRF) model, a mechanism-oriented
memory model that is intended for architects and compiler writers. The CRF model has a
semantic notion of caches, which makes the operational behavior of data replication part
of the model. It decomposes memory access operations into finer-grain instructions: a
load operation becomes a Reconcile followed by a Loadl, and a store operation becomes a Storel
followed by a Commit. A novel feature of CRF is that programs written under various memory
models such as sequential consistency and release consistency can be translated into efficient
CRF programs. With appropriate extension, CRF can also be used to define and improve
program-oriented memory models such as the Java memory model [85].
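The decomposition described above can be illustrated with a minimal sketch. It shows only the basic rewriting stated in the text (load becomes Reconcile then Loadl, store becomes Storel then Commit). The tuple encoding of instructions and the function name `to_crf` are assumptions made for illustration, not notation from the thesis, and the sketch does not cover fences or the model-specific translations for sequential or release consistency.

```python
from typing import List, Tuple

# Hypothetical encoding (an assumption for this sketch):
#   ("Load", addr)          conventional load
#   ("Store", addr, value)  conventional store
Instr = Tuple


def to_crf(program: List[Instr]) -> List[Instr]:
    """Expand each conventional load/store into its pair of CRF instructions."""
    out: List[Instr] = []
    for instr in program:
        op = instr[0]
        if op == "Load":
            addr = instr[1]
            # A load becomes a Reconcile followed by a Loadl.
            out.append(("Reconcile", addr))
            out.append(("Loadl", addr))
        elif op == "Store":
            addr, value = instr[1], instr[2]
            # A store becomes a Storel followed by a Commit.
            out.append(("Storel", addr, value))
            out.append(("Commit", addr))
        else:
            raise ValueError(f"unsupported instruction: {op!r}")
    return out


if __name__ == "__main__":
    for crf_instr in to_crf([("Load", "a"), ("Store", "b", 1)]):
        print(crf_instr)
```

Keeping the memory-side step (Reconcile, Commit) separate from the cache-local step (Loadl, Storel) is what makes the model's finer granularity visible in this representation.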

1.2 Cache Coherence Protocols

Shared memory multiprocessor systems provide a global address space in which processors can
exchange information and synchronize with one another. When shared variables are cached in
multiple caches simultaneously, a memory store operation performed by one processor can make
data copies of the same variable in other caches out of date. The primary objective of a cache
coherence protocol is to provide a coherent memory image for the system so that each processor
can observe the semantic effect of memory access operations performed by other processors in
time.
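The stale-copy problem described above can be reproduced in a toy model. The sketch below is an illustration under stated assumptions, not a protocol from the thesis: the class `ToySystem`, its write-through stores, and the write-invalidate option are all hypothetical choices made to keep the example small. Write-invalidate is used here only as one well-known way to keep copies coherent.

```python
class ToySystem:
    """Toy shared-memory system with one private cache per processor.

    Assumptions for this sketch: stores write through to memory, and
    coherence (when enabled) is a simple write-invalidate of other copies.
    """

    def __init__(self, n_procs: int, invalidate_on_store: bool):
        self.memory = {}
        self.caches = [dict() for _ in range(n_procs)]
        self.invalidate_on_store = invalidate_on_store

    def load(self, proc: int, addr: str) -> int:
        cache = self.caches[proc]
        if addr not in cache:
            # Cache miss: fetch from memory (unwritten locations read as 0).
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def store(self, proc: int, addr: str, value: int) -> None:
        self.caches[proc][addr] = value
        self.memory[addr] = value  # write-through
        if self.invalidate_on_store:
            # Drop every other cached copy so later loads refetch.
            for i, cache in enumerate(self.caches):
                if i != proc:
                    cache.pop(addr, None)


def observed_after_remote_store(invalidate_on_store: bool) -> int:
    """Processor 1 caches x, processor 0 stores 42, processor 1 reloads x."""
    system = ToySystem(n_procs=2, invalidate_on_store=invalidate_on_store)
    system.load(1, "x")
    system.store(0, "x", 42)
    return system.load(1, "x")


if __name__ == "__main__":
    print("without coherence:", observed_after_remote_store(False))
    print("with invalidation:", observed_after_remote_store(True))
```

Without invalidation, processor 1 keeps reading its out-of-date cached copy. With invalidation, its next load misses and fetches the new value from memory.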

Shared memory systems can be implemented using different approaches. Typical hardware
implementations extend traditional caching techniques and use custom communication
interfaces and specific hardware support [5, 10, 20, 24, 79, 91, 93]. In Non-Uniform Memory Access
(NUMA) systems, each site contains a memory-cache controller that determines if an access
is to the local memory or some remote memory, based on the physical address of the memory
access. In Cache Only Memory Architecture (COMA) systems, each local memory behaves as
a large cache that allows shared data to be replicated and migrated. Simple COMA (S-COMA)
systems divide the task of managing the global virtual address space between hardware and
software to provide automatic data migration and replication with much less expensive hardware
support. Hybrid systems can combine the advantages of NUMA and COMA systems [43, 65].
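The NUMA address check described above could be sketched as follows. The text says only that the controller decides between local and remote memory from the physical address. The specific mapping used here (each site owning one contiguous, equally sized block of the physical address space) and the names `home_site`, `is_local`, and `route` are assumptions for illustration; real machines may interleave or map addresses differently.

```python
def home_site(phys_addr: int, bytes_per_site: int) -> int:
    """Return the site that owns phys_addr.

    Assumes site k owns addresses [k * bytes_per_site, (k + 1) * bytes_per_site).
    """
    if phys_addr < 0 or bytes_per_site <= 0:
        raise ValueError("address must be non-negative and site size positive")
    return phys_addr // bytes_per_site


def is_local(phys_addr: int, site_id: int, bytes_per_site: int) -> bool:
    """True if the access can be served by this site's own memory."""
    return home_site(phys_addr, bytes_per_site) == site_id


def route(phys_addr: int, site_id: int, bytes_per_site: int) -> str:
    """Describe where the controller at site_id would send the access."""
    if is_local(phys_addr, site_id, bytes_per_site):
        return "local"
    return f"remote:{home_site(phys_addr, bytes_per_site)}"


if __name__ == "__main__":
    GIB = 1 << 30  # hypothetical 1 GiB of memory per site
    print(route(0x10, site_id=0, bytes_per_site=GIB))
    print(route(GIB + 0x10, site_id=0, bytes_per_site=GIB))
```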

Software shared memory systems can employ virtual memory management mechanisms to
achieve sharing and coherence [9, 81]. They often exhibit COMA-like properties by migrating
and replicating pages between different sites. One attractive way to build scalable shared
memory systems is to use small-scale to medium-scale shared memory machines as clusters
