10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?



APPENDIX E. FORMAL VERIFICATION

4. Therefore, at any given point in time, either one of the counters will be at least 2, or both of the counters will be at least one.

5. However, the synchronize_qrcu() fastpath code can read only one of the counters at a given time. It is therefore possible for the fastpath code to fetch the first counter while zero, but to race with a counter flip so that the second counter is seen as one.

6. There can be at most one reader persisting through such a race condition, as otherwise the sum would be two or greater, which would cause the updater to take the slowpath.

7. But if the race occurs on the fastpath’s first read of the counters, and then again on its second read, there have to have been two counter flips.

8. Because a given updater flips the counter only once, and because the update-side lock prevents a pair of updaters from concurrently flipping the counters, the only way that the fastpath code can race with a flip twice is if the first updater completes.

9. But the first updater will not complete until after all pre-existing readers have completed.

10. Therefore, if the fastpath races with a counter flip twice in succession, all pre-existing readers must have completed, so that it is safe to take the fastpath.

Of course, not all parallel algorithms have such simple proofs. In such cases, it may be necessary to enlist more capable tools.

E.6.4 Alternative Approach: More Capable Tools

Although Promela and Spin are quite useful, much more capable tools are available, particularly for verifying hardware. This means that if it is possible to translate your algorithm to the hardware-design VHDL language, as it often will be for low-level parallel algorithms, then it is possible to apply these tools to your code (for example, this was done for the first realtime RCU algorithm).
However, such tools can be quite expensive.

Although the advent of commodity multiprocessing might eventually result in powerful free-software model-checkers featuring fancy state-space-reduction capabilities, this does not help much in the here and now.

As an aside, there are Spin features that support approximate searches that require fixed amounts of memory; however, I have never been able to bring myself to trust approximations when verifying parallel algorithms.

Another approach might be to divide and conquer.

E.6.5 Alternative Approach: Divide and Conquer

It is often possible to break down a larger parallel algorithm into smaller pieces, which can then be proven separately. For example, a 10-billion-state model might be broken into a pair of 100,000-state models. Taking this approach not only makes it easier for tools such as Promela to verify your algorithms, it can also make your algorithms easier to understand.

E.7 Promela Parable: dynticks and Preemptable RCU

In early 2008, a preemptable variant of RCU was accepted into mainline Linux in support of realtime workloads, a variant similar to the RCU implementations in the -rt patchset [Mol05] since August 2005. Preemptable RCU is needed for real-time workloads because older RCU implementations disable preemption across RCU read-side critical sections, resulting in excessive real-time latencies.

However, one disadvantage of the older -rt implementation (described in Appendix D.4) was that each grace period requires work to be done on each CPU, even if that CPU is in a low-power “dynticks-idle” state, and thus incapable of executing RCU read-side critical sections.
The idea behind the dynticks-idle state is that idle CPUs should be physically powered down in order to conserve energy. In short, preemptable RCU can disable a valuable energy-conservation feature of recent Linux kernels. Although Josh Triplett and Paul McKenney had discussed some approaches for allowing CPUs to remain in low-power state throughout an RCU grace period (thus preserving the Linux kernel’s ability to conserve energy), matters did not come to a head until Steve Rostedt integrated a new dyntick implementation with preemptable RCU in the -rt patchset.

This combination caused one of Steve’s systems to hang on boot, so in October, Paul coded up a dynticks-friendly modification to preemptable RCU’s grace-period processing. Steve coded up rcu_irq_enter() and rcu_irq_exit() interfaces
