10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


Quick Quiz 12.8: How could the code on page 122 possibly leak memory?

Suppose that the third property did not hold. Then the counter shown in the following code might well count backwards. This third property is crucial, as it cannot be guaranteed strictly with pairwise memory barriers.

    spin_lock(&mylock);
    ctr = ctr + 1;
    spin_unlock(&mylock);

Quick Quiz 12.9: How could the code on page 122 possibly count backwards?

If you are convinced that these rules are necessary, let's look at how they interact with a typical locking implementation.

12.2.5 Review of Locking Implementations

Naive pseudocode for simple lock and unlock operations is shown below. Note that the atomic_xchg() primitive implies a memory barrier both before and after the atomic exchange operation, which eliminates the need for an explicit memory barrier in spin_lock(). Note also that, despite their names, atomic_read() and atomic_set() do not execute any atomic instructions; instead, they merely execute a simple load and a simple store, respectively. This pseudocode follows a number of Linux implementations for the unlock operation, which is a simple non-atomic store following a memory barrier. These minimal implementations must possess all the locking properties laid out in Section 12.2.4.

    void spin_lock(spinlock_t *lck)
    {
        /* Try to take the lock; atomic_xchg() implies barriers on both sides. */
        while (atomic_xchg(&lck->a, 1) != 0)
            /* Lock was held: spin on plain loads until it looks free. */
            while (atomic_read(&lck->a) != 0)
                continue;
    }

    void spin_unlock(spinlock_t *lck)
    {
        smp_mb();  /* Order the critical section before the releasing store. */
        atomic_set(&lck->a, 0);
    }

The spin_lock() primitive cannot proceed until the preceding spin_unlock() primitive completes. If CPU 1 is releasing a lock that CPU 2 is attempting to acquire, the sequence of operations might be as follows:

    CPU 1                   CPU 2
    (critical section)      atomic_xchg(&lck->a,1)->1
    smp_mb();               lck->a->1
    lck->a=0;               lck->a->1
                            lck->a->0
                            (implicit smp_mb() 1)
                            atomic_xchg(&lck->a,1)->0
                            (implicit smp_mb() 2)
                            (critical section)

In this particular case, pairwise memory barriers suffice to keep the two critical sections in place.
CPU 2's atomic_xchg(&lck->a,1) has seen CPU 1's lck->a=0, so everything in CPU 2's following critical section must see everything that CPU 1's preceding critical section did. Conversely, CPU 1's critical section cannot see anything that CPU 2's critical section will do.

12.2.6 A Few Simple Rules

Probably the easiest way to understand memory barriers is to understand a few simple rules:

1. Each CPU sees its own accesses in order.

2. If a single shared variable is loaded and stored by multiple CPUs, then the series of values seen by a given CPU will be consistent with the series seen by the other CPUs, and there will be at least one sequence consisting of all values stored to that variable with which each CPU's series will be consistent. [3]

3. If one CPU does ordered stores to variables A and B [4], and if a second CPU does ordered loads from B and A [5], then if the second CPU's load from B gives the value stored by the first CPU, then the second CPU's load from A must give the value stored by the first CPU.

4. If one CPU does a load from A ordered before a store to B, and if a second CPU does a load from B ordered before a store to A, and if the second CPU's load from B gives the value stored by the first CPU, then the first CPU's load from A must not give the value stored by the second CPU.

[3] A given CPU's series may of course be incomplete. For example, if a given CPU never loaded or stored the shared variable, then it can have no opinion about that variable's value.

[4] For example, by executing the store to A, a memory barrier, and then the store to B.

[5] For example, by executing the load from B, a memory barrier, and then the load from A.
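
Rule 3 can be tried out in user space. The following is a minimal sketch, not kernel code: it assumes C11 atomics and POSIX threads, and uses atomic_thread_fence(memory_order_seq_cst) as a stand-in for smp_mb(). The names writer(), reader(), and count_rule3_violations() are hypothetical and chosen only for this illustration. The writer follows the store-barrier-store pattern and the reader the load-barrier-load pattern described for the two CPUs in rule 3.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables named after rule 3's A and B (illustrative only). */
static atomic_int A, B;

/* First CPU: store to A, full barrier, then store to B. */
static void *writer(void *arg)
{
    (void)arg;
    atomic_store_explicit(&A, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* assumed stand-in for smp_mb() */
    atomic_store_explicit(&B, 1, memory_order_relaxed);
    return NULL;
}

/* Second CPU: load from B, full barrier, then load from A. */
static void *reader(void *arg)
{
    int *violation = arg;
    int b = atomic_load_explicit(&B, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* assumed stand-in for smp_mb() */
    int a = atomic_load_explicit(&A, memory_order_relaxed);

    /* Rule 3 forbids seeing the new B together with the old A. */
    *violation = (b == 1 && a == 0);
    return NULL;
}

/* Run the two-thread experiment repeatedly and count forbidden outcomes. */
int count_rule3_violations(int trials)
{
    int violations = 0;

    for (int i = 0; i < trials; i++) {
        pthread_t w, r;
        int violation = 0;
        int rc;

        /* Reset both variables before the threads start. */
        atomic_store(&A, 0);
        atomic_store(&B, 0);

        rc = pthread_create(&r, NULL, reader, &violation);
        assert(rc == 0);
        rc = pthread_create(&w, NULL, writer, NULL);
        assert(rc == 0);

        pthread_join(w, NULL);
        pthread_join(r, NULL);
        violations += violation;
    }
    return violations;
}
```

With both fences in place, count_rule3_violations() should return zero for any number of trials. Removing either fence makes the forbidden outcome permissible under the C11 memory model, although whether it is actually observed depends on the compiler and the hardware.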
