10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


This sequence of events could repeat indefinitely, so that no finite value of GP_STAGES could prevent disrupting Task A. This sequence of events demonstrates the importance of the promise made by CPUs that acknowledge an increment of rcu_ctrlblk.completed, as the problem illustrated by the above sequence of events is caused by Task B’s repeated failure to honor this promise.

Therefore, more-pervasive changes to the grace-period state will be required in order for rcu_read_lock() to be able to safely dispense with irq disabling.

Quick Quiz D.63:
Why can’t the rcu_dereference() precede the memory barrier?

Answer:
Because the memory barrier is being executed in an interrupt handler, and interrupts are exact in the sense that a single value of the PC is saved upon interrupt, so that the interrupt occurs at a definite place in the code. Therefore, if the rcu_dereference() were to precede the memory barrier, the interrupt would have had to have occurred after the rcu_dereference(), and therefore the interrupt would also have had to have occurred after the rcu_read_lock() that begins the RCU read-side critical section. This would have forced the rcu_read_lock() to use the earlier value of the grace-period counter, which would in turn have meant that the corresponding rcu_read_unlock() would have had to precede the first “Old counters zero [0]” rather than the second one.
This in turn would have meant that the read-side critical section would have been much shorter, which would have been counter-productive, given that the point of this exercise was to identify the longest possible RCU read-side critical section.

Quick Quiz D.64:
What is a more precise way to say “CPU 0 might see CPU 1’s increment as early as CPU 1’s last previous memory barrier”?

Answer:
First, it is important to note that the problem with the less-precise statement is that it gives the impression that there might be a single global timeline, which there is not, at least not for popular microprocessors. Second, it is important to note that memory barriers are all about perceived ordering, not about time. Finally, a more precise way of stating the above statement would be as follows:

“If CPU 0 loads the value resulting from CPU 1’s increment, then any subsequent load by CPU 0 will see the values from any relevant stores by CPU 1 if these stores preceded CPU 1’s last prior memory barrier.”

Even this more-precise version leaves some wiggle room. The word “subsequent” must be understood to mean “ordered after”, either by an explicit memory barrier or by the CPU’s underlying memory ordering. In addition, the memory barriers must be strong enough to order the relevant operations. For example, CPU 1’s last prior memory barrier must order stores (for example, smp_wmb() or smp_mb()). Similarly, if CPU 0 needs an explicit memory barrier to ensure that its later load follows the one that saw the increment, then this memory barrier needs to be an smp_rmb() or smp_mb().

In general, much care is required when proving parallel algorithms.

F.16 Chapter E: Formal Verification

Quick Quiz E.1:
Why is there an unreached statement in locker? After all, isn’t this a full state-space search???

Answer:
The locker process is an infinite loop, so control never reaches the end of this process.
However, since there are no monotonically increasing variables, Promela is able to model this infinite loop with a small number of states.

Quick Quiz E.2:
What are some Promela code-style issues with this example?

Answer:
There are several:

1. The declaration of sum should be moved to within the init block, since it is not used anywhere else.
2. The assertion code should be moved outside of the initialization loop. The initialization loop can then be placed in an atomic block, greatly reducing the state space (by how much?).
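
Returning briefly to the answer to Quick Quiz D.64: the more-precise statement about CPU 0 and CPU 1 can be illustrated with a small userspace sketch. This is not kernel code. It assumes C11 atomics and POSIX threads, and it uses atomic_thread_fence() with release and acquire ordering as rough userspace stand-ins for smp_wmb() and smp_rmb(); these fences are not exact equivalents of the kernel primitives. The names payload, counter, cpu0(), cpu1(), and run_demo() are hypothetical and chosen only for this illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;         /* a "relevant store" made before the barrier */
static atomic_int counter;  /* hypothetical stand-in for the incremented counter */

/* Plays the role of CPU 1: store, then barrier, then increment. */
static void *cpu1(void *arg)
{
    (void)arg;
    payload = 42;                               /* store preceding the barrier */
    atomic_thread_fence(memory_order_release);  /* "last prior memory barrier" */
    atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    return NULL;
}

/* Plays the role of CPU 0: load the increment, then barrier, then load. */
static void *cpu0(void *arg)
{
    int *seen = arg;

    /* Spin until this thread loads the value resulting from the increment. */
    while (atomic_load_explicit(&counter, memory_order_relaxed) == 0)
        ;
    /* Orders the later load after the load that saw the increment. */
    atomic_thread_fence(memory_order_acquire);
    *seen = payload;                            /* the "subsequent load" */
    return NULL;
}

/* Runs both threads once and returns the payload value seen by cpu0(). */
int run_demo(void)
{
    pthread_t t0, t1;
    int seen = -1;

    payload = 0;
    atomic_store(&counter, 0);
    pthread_create(&t0, NULL, cpu0, &seen);
    pthread_create(&t1, NULL, cpu1, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return seen;
}
```

Under the C11 memory model, once cpu0() has loaded the incremented counter and executed its acquire fence, its later load of payload is guaranteed to see the store that preceded cpu1()'s release fence, so run_demo() returns 42. Removing either fence removes that guarantee, which is the sense in which the barriers on both sides must be strong enough to order the relevant operations.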
