10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?




CHAPTER 12. ADVANCED SYNCHRONIZATION

[Figure 12.7: Write Barrier Ordering Semantics. Shows the sequence in which stores are committed to the memory system by CPU 1; the write barrier requires all stores prior to the barrier to be committed before further stores may take place.]

[Figure 12.8: Data Dependency Barrier Omitted. Shows the sequence of update of perception on CPU 2; the load of X holds up the maintenance of coherence of B, giving an apparently incorrect perception of B.]

then the partial ordering imposed by CPU 1’s write barrier will be perceived correctly by CPU 2, as shown in Figure 12.11.

To illustrate this more completely, consider what could happen if the code contained a load of A on either side of the read barrier, once again with the same initial values of {A=0,B=9}:

    CPU 1               CPU 2
    a = 1;
    <write barrier>
    b = 2;
                        LOAD B
                        LOAD A (1st)
                        <read barrier>
                        LOAD A (2nd)

Even though the two loads of A both occur after the load of B, they may come up with different values, as shown in Figure 12.12.

Of course, it may well be that CPU 1’s update to A becomes perceptible to CPU 2 before the read barrier completes, as shown in Figure 12.13.

The guarantee is that the second load will always come up with A==1 if the load of B came up with B==2. No such guarantee exists for the first load of A; that may come up with either A==0 or A==1.

12.2.10.8 Read Memory Barriers vs. Load Speculation

Many CPUs speculate with loads: that is, they see that they will need to load an item from memory, and they find a time where they’re not using the bus for any other loads, and then do the load in advance, even though they haven’t actually got to that point in the instruction execution flow yet. Later on, this potentially permits the actual load instruction to complete immediately because the CPU already
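
The write-barrier/read-barrier pairing discussed earlier in this section (two loads of A around a read barrier, with initial values {A=0,B=9}) can be tried out in user space. The following is a minimal sketch, not the book's code: it assumes C11 atomics and POSIX threads, and it uses atomic_thread_fence() with release and acquire ordering as stand-ins for the write and read barriers, which is an approximation of the kernel primitives rather than an exact match. The names cpu1, cpu2, struct observation, and count_violations are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables from the example; initial values are {A=0, B=9}. */
static atomic_int A;
static atomic_int B;

/* Values that the "CPU 2" thread observed during one run. */
struct observation {
    int b;        /* LOAD B */
    int a_first;  /* LOAD A (1st), before the read barrier */
    int a_second; /* LOAD A (2nd), after the read barrier */
};

/* "CPU 1": store to A, write barrier, store to B. */
static void *cpu1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&A, 1, memory_order_relaxed);
    /* Assumption: a release fence stands in for the write barrier. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&B, 2, memory_order_relaxed);
    return NULL;
}

/* "CPU 2": load B, load A, read barrier, load A again. */
static void *cpu2(void *arg)
{
    struct observation *obs = arg;

    obs->b = atomic_load_explicit(&B, memory_order_relaxed);
    obs->a_first = atomic_load_explicit(&A, memory_order_relaxed);
    /* Assumption: an acquire fence stands in for the read barrier. */
    atomic_thread_fence(memory_order_acquire);
    obs->a_second = atomic_load_explicit(&A, memory_order_relaxed);
    return NULL;
}

/*
 * Run the two-thread example `trials` times and count the runs in which
 * B==2 was observed but the second load of A did not return 1.
 * Returns -1 if a thread could not be created.
 */
int count_violations(int trials)
{
    int violations = 0;

    assert(trials >= 0);
    for (int i = 0; i < trials; i++) {
        struct observation obs;
        pthread_t t1, t2;

        /* Reset to the initial values {A=0, B=9} before each run. */
        atomic_store(&A, 0);
        atomic_store(&B, 9);

        if (pthread_create(&t2, NULL, cpu2, &obs) != 0)
            return -1;
        if (pthread_create(&t1, NULL, cpu1, NULL) != 0) {
            pthread_join(t2, NULL);
            return -1;
        }
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        /* The first load of A is deliberately not checked: it may be 0 or 1. */
        if (obs.b == 2 && obs.a_second != 1)
            violations++;
    }
    return violations;
}
```

Calling count_violations() with a few thousand trials should return 0, because the C11 memory model forbids the outcome being counted when the fences are paired this way. A return of 0 does not show that the fences are necessary, since a given machine may never exhibit the reordering even without them. The sketch may need to be built with -pthread on some toolchains.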
