10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?

Is Parallel Programming Hard, And, If So, What Can You Do About It?

Is Parallel Programming Hard, And, If So, What Can You Do About It?

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

12.2. MEMORY BARRIERS 121CPU 1 CPU 2 Description0 load(A) load(B) load(B) load(A) Ears to ears.1 load(A) load(B) load(B) store(A) Only one store.2 load(A) load(B) store(B) load(A) Only one store.3 load(A) load(B) store(B) store(A) Pairing 1.4 load(A) store(B) load(B) load(A) Only one store.5 load(A) store(B) load(B) store(A) Pairing 2.6 load(A) store(B) store(B) load(A) Mouth to mouth, ear to ear.7 load(A) store(B) store(B) store(A) Pairing 3.8 store(A) load(B) load(B) load(A) Only one store.9 store(A) load(B) load(B) store(A) Mouth to mouth, ear to ear.A store(A) load(B) store(B) load(A) Ears to mouths.B store(A) load(B) store(B) store(A) Stores “pass in the night”.C store(A) store(B) load(B) load(A) Pairing 1.D store(A) store(B) load(B) store(A) Pairing 3.E store(A) store(B) store(B) load(A) Stores “pass in the night”.F store(A) store(B) store(B) store(A) Stores “pass in the night”.Table 12.1: Memory-Barrier Combinationsbe either 0 or 1.Pairing 2. In this pairing, each CPU executesa load followed by a memory barrierfollowed by a store, as follows (bothA and B are initially equal to zero):CPU 1 CPU 2X=A; Y=B;smp_mb(); smp_mb();B=1; A=1;After both CPUs have completed executing thesecode sequences, if X==1, then we must also haveY==0. In this case, the fact that X==1 means thatCPU 1’s load prior to its memory barrier has seenthe store following CPU 2’s memory barrier. Dueto the pairwise nature of memory barriers, CPU 1’sstore following its memory barrier must thereforesee the results of CPU 2’s load preceding its memorybarrier, so that Y==0.On the other hand, if X==0, the memory-barriercondition does not hold, and so in this case, Y couldbe either 0 or 1.The two CPUs’ code sequences are symmetric,so if Y==1 after both CPUs have finished executingthese code sequences, then we must have X==0.Pairing 3. In this pairing, one CPU executesa load followed by a memory barrier followed bya store, while the other CPU executes a pairof stores separated by a memory barrier, as follows(both A and B are initially equal to zero):CPU 1 CPU 2X=A; B=2;smp_mb(); smp_mb();B=1; A=1;After both CPUs have completed executing thesecode sequences, if X==1, then we must also haveB==1. In this case, the fact that X==1 means thatCPU 1’s load prior to its memory barrier has seenthe store following CPU 2’s memory barrier. Dueto the pairwise nature of memory barriers, CPU 1’sstorefollowing itsmemorybarriermustthereforeseethe results of CPU 2’s store preceding its memorybarrier. This means that CPU 1’s store to B willoverwrite CPU 2’s store to B, resulting in B==1.On the other hand, if X==0, the memory-barriercondition does not hold, and so in this case, B couldbe either 1 or 2.12.2.4.5 Pair-Wise Memory Barriers: Semi-Portable CombinationsThe following pairings from Table 12.1 can be usedon modern hardware, but might fail on some systemsthat were produced in the 1990s. However,these can safely be used on all mainstream hardwareintroduced since the year 2000.Ears to Mouths. Since the stores cannot see theresults of the loads (again, ignoring MMIO registersfor the moment), it is not always possible to determinewhether the memory-barrier condition hasbeen met. However, recent hardware would guaran-

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!