29.11.2015 Views

The C11 and C++11 Concurrency Model

It is not yet obvious that the writes of Thread 1 must be propagated to Thread 3 before the write of Thread 2, however. A new guarantee, called B-cumulativity, provides ordering through executions with chained dependencies following an lwsync. In this example, B-cumulativity ensures that the store of x is propagated to Thread 3 before the store of z.

The example above shows that B-cumulativity extends ordering to the right of an lwsync; A-cumulativity extends ordering to the left. Consider the following example, called WRC+lwsync+addr for write-to-read causality [37] with an lwsync and an address dependency. Thread 1 writes x, and Thread 2 writes y:

int x = 0;
int y = 0;

Thread 1  |  Thread 2  |  Thread 3
x = 1;    |  r1 = x;   |  r2 = y;
          |  lwsync    |  r3 = *(&x + r2 - r2);
          |  y = 1;    |

The lwsync and the address dependency prevent thread-local speculation from occurring on Threads 2 and 3, but there is so far nothing to force the write on Thread 1 to propagate to Thread 3 before the write of Thread 2, allowing the outcome 1/1/0 for the reads of x on Thread 2, y on Thread 3 and x on Thread 3. We define the group A writes to be those that have propagated to the thread of an lwsync at the point that it is executed. The write of x is in the group A of the lwsync in the execution of the program above. A-cumulativity requires group A writes to propagate to all threads before writes that follow the barrier in program order, guaranteeing that the outcome 1/1/0 is forbidden on Power and ARM architectures; it provides ordering to dependency chains to the left of an lwsync.

Note that in either case of cumulativity, if the dependencies were replaced by lwsyncs or syncs, then the ordering would still be guaranteed.
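For readers more familiar with the language level, the WRC shape can be sketched with C++11 atomics. This is an analogue under stated assumptions, not the Power program above: the lwsync is stood in for by a release store on Thread 2 (compilers for Power typically place an lwsync before such a store), and the address dependency on Thread 3 is replaced by an acquire load. The names `WrcResult`, `run_wrc_once` and `count_forbidden_outcomes` are hypothetical helpers introduced only for this illustration.

```cpp
#include <atomic>
#include <thread>

// Values read in one run of the WRC shape (hypothetical helper type).
struct WrcResult {
    int r1;  // Thread 2's read of x
    int r2;  // Thread 3's read of y
    int r3;  // Thread 3's read of x
};

// Runs the three threads once and returns the values they read.
WrcResult run_wrc_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    WrcResult res{0, 0, 0};

    // Thread 1: writes x.
    std::thread t1([&] {
        x.store(1, std::memory_order_release);
    });

    // Thread 2: reads x, then writes y with release ordering.
    // The release store plays the role of "lwsync; y = 1".
    std::thread t2([&] {
        res.r1 = x.load(std::memory_order_acquire);
        y.store(1, std::memory_order_release);
    });

    // Thread 3: reads y, then x. The acquire load is an assumption that
    // replaces the address dependency of the original test.
    std::thread t3([&] {
        res.r2 = y.load(std::memory_order_acquire);
        res.r3 = x.load(std::memory_order_relaxed);
    });

    t1.join();
    t2.join();
    t3.join();
    return res;
}

// Counts how many of `runs` executions show the outcome 1/1/0.
int count_forbidden_outcomes(int runs) {
    int forbidden = 0;
    for (int i = 0; i < runs; ++i) {
        WrcResult r = run_wrc_once();
        if (r.r1 == 1 && r.r2 == 1 && r.r3 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

In this sketch the 1/1/0 outcome is excluded by the language model itself: if Thread 2 reads 1 from x and Thread 3 reads 1 from y, the release/acquire pairs order Thread 1's write before Thread 3's final read of x, so `count_forbidden_outcomes` should return 0 for any number of runs.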

Load-linked store-conditional. The Power and ARM architectures provide load-linked (LL) and store-conditional (SC) instructions that allow the programmer to load from a location, and then store only if no other thread accessed the location in the interval between the two. These instructions allow the programmer to establish consensus as the global lock did in x86. The load-linked instruction is a load from memory that works in conjunction with a program-order-later store-conditional. The store-conditional has two possible outcomes: it can store to memory, or it may fail if the coherence commitment order is sufficiently unconstrained, allowing future steps of the abstract machine to place writes before it. On success, load-linked and store-conditional instructions atomically read and then write a location, and can be used to implement language features like compare-and-swap.
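At the language level, the retry pattern that a failing store-conditional forces can be sketched with a C++11 compare-and-swap loop. This is an illustration under assumptions rather than the hardware mechanism itself: `fetch_increment` is a hypothetical helper, and the mapping of `compare_exchange_weak` onto an LL/SC pair is typical of compilers for Power and ARM but is not guaranteed by the language. The weak form is allowed to fail spuriously, which mirrors a store-conditional that fails even though the value was unchanged.

```cpp
#include <atomic>

// Atomically adds 1 to `counter` and returns the value it held before.
// Hypothetical helper written as an explicit retry loop for illustration.
int fetch_increment(std::atomic<int>& counter) {
    // Initial read; plays the role of the load-linked.
    int observed = counter.load(std::memory_order_relaxed);

    // Try to replace `observed` with `observed + 1`. On failure (another
    // write intervened, or a spurious failure), compare_exchange_weak
    // stores the current value into `observed` and the loop retries.
    while (!counter.compare_exchange_weak(observed, observed + 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        // Empty body: the condition above performs the retry.
    }
    return observed;
}
```

The standard library already provides `fetch_add` for this particular operation; the explicit loop is shown only because it exposes the read, attempt, and retry structure described above.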
