29.11.2015 Views

The C11 and C++11 Concurrency Model



atomic x(0); atomic y(0);

T0:  x.store(1,memory_order_release);

T1:  r1=x.load(memory_order_acquire);
     r2=y.load(memory_order_acquire);

T2:  y.store(1,memory_order_release);

T3:  r3=y.load(memory_order_acquire);
     r4=x.load(memory_order_acquire);

The Power analogue of this program allows IRIW behaviour. Full sync instructions would be required between the loads in order to forbid this. The sync is stronger than an lwsync: it requires all Group A and program-order preceding writes to be propagated to all threads before the thread continues. This is enough to forbid the IRIW relaxed behaviour. If we replace the memory orders in the program above with the seq_cst memory order, then the mapping would provide sync instructions between the reads, and the IRIW outcome would be forbidden in the compiled Power program, as required.
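The seq_cst variant of the program above can be sketched as a small C++11 harness. This is an illustrative sketch only: the function names `iriw_once` and `count_iriw` are hypothetical, and the use of `std::thread` to play the roles of T0 to T3 is an assumption of this sketch rather than part of the original listing. The outcome r1=1, r2=0, r3=1, r4=0 would need the two reading threads to see the writes in opposite orders, which the single total order over seq_cst operations rules out, so the count should remain zero on a conforming implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One run of the IRIW shape with every access at memory_order_seq_cst.
// Returns true only if the outcome forbidden by the language was observed.
bool iriw_once() {
    std::atomic<int> x(0), y(0);
    int r1 = 0, r2 = 0, r3 = 0, r4 = 0;

    std::thread t0([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread t1([&] {
        r1 = x.load(std::memory_order_seq_cst);
        r2 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread t3([&] {
        r3 = y.load(std::memory_order_seq_cst);
        r4 = x.load(std::memory_order_seq_cst);
    });

    t0.join(); t1.join(); t2.join(); t3.join();

    // T1 sees x before y, while T3 sees y before x.
    return r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0;
}

// Repeats the test and counts how often the forbidden outcome appears.
int count_iriw(int runs) {
    int seen = 0;
    for (int i = 0; i < runs; ++i) {
        if (iriw_once()) ++seen;
    }
    return seen;
}
```

Calling `count_iriw(200)` runs the test 200 times. A zero count is consistent with the guarantee but does not demonstrate it, since a test harness can only fail to observe a behaviour. Changing the memory orders back to release and acquire makes the outcome permitted by the language, although it may still never show up on hardware such as x86.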

The examples presented here help to explain how the synchronisation instructions in the mapping forbid executions that the language does not allow. We can also show that if the mapping were weakened in any way, it would fail to correctly implement C/C++11 [26]. The proof of this involves considering each entry in the mapping, weakening it, and observing some new relaxed behaviour that should be forbidden.

For example, consider two of the cases in the mapping in the context of the examples above. First, if we remove the dependency from the implementation of consume atomics, then we enable read-side speculation in the message-passing example, and we allow the relaxed behaviour. Second, if we swap the sync for an lwsync in the implementation of SC loads, then we would be able to see IRIW behaviour in the example above.
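To make the first case concrete, the following is a minimal sketch of a message-passing program whose reader uses a consume load. The function name `message_passing_consume` and the value 42 are hypothetical, and the shape is assumed to resemble the message-passing example referred to above rather than reproduce it. The reader dereferences the pointer it loaded, so the read of `data` carries a dependency from the consume load; it is this dependency that the mapping must preserve for the stale value to be forbidden.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Writer publishes a pointer to initialised data with a release store;
// reader picks it up with a consume load and reads through the pointer.
int message_passing_consume() {
    int data = 0;
    std::atomic<int*> ptr(nullptr);
    int seen = -1;

    std::thread writer([&] {
        data = 42;                                    // non-atomic payload write
        ptr.store(&data, std::memory_order_release);  // publish the payload
    });
    std::thread reader([&] {
        int* p;
        // Spin until the pointer has been published.
        while ((p = ptr.load(std::memory_order_consume)) == nullptr) { }
        seen = *p;  // address depends on the value returned by the consume load
    });

    writer.join();
    reader.join();
    return seen;
}
```

Under the language model the reader can only observe 42 here. Note that mainstream compilers commonly treat `memory_order_consume` as `memory_order_acquire`, so this sketch illustrates the source-level shape rather than the exact instructions a dependency-preserving mapping would emit.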

7.2.2 Overview of the formal proof<br />

The proof of correctness of the Power mapping (primarily the work of Sarkar and Memarian) has a similar form to the x86 result at the highest level, but the details are rather different. The Power model is an abstract machine, comprising a set of threads that communicate with a storage subsystem. In Chapter 2 the role of each component was explained: the threads enable speculation of read values, and the storage subsystem models the propagation of values throughout the processor.

[Figure: the Power abstract machine. Each Thread sends write requests, read requests and barrier requests to the Storage Subsystem, and receives read responses and barrier acks from it.]
