10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


106 CHAPTER 8. DEFERRED PROCESSING

overhead ranges from about 600 nanoseconds on a single-CPU Power5 system up to more than 100 microseconds on a 64-CPU system.

Quick Quiz 8.57: To be sure, the clock frequencies of ca-2008 Power systems were quite high, but even a 5GHz clock frequency is insufficient to allow loops to be executed in 50 picoseconds! What is going on here?

However, this implementation requires that each thread either invoke rcu_quiescent_state() periodically or invoke rcu_thread_offline() for extended quiescent states. The need to invoke these functions periodically can make this implementation difficult to use in some situations, such as for certain types of library functions.

Quick Quiz 8.58: Why would the fact that the code is in a library make any difference for how easy it is to use the RCU implementation shown in Figures 8.44 and 8.45?

Quick Quiz 8.59: But what if you hold a lock across a call to synchronize_rcu(), and then acquire that same lock within an RCU read-side critical section? This should be a deadlock, but how can a primitive that generates absolutely no code possibly participate in a deadlock cycle?

In addition, this implementation does not permit concurrent calls to synchronize_rcu() to share grace periods. That said, one could easily imagine a production-quality RCU implementation based on this version of RCU.

8.3.4.10 Summary of Toy RCU Implementations

If you made it this far, congratulations! You should now have a much clearer understanding not only of RCU itself, but also of the requirements of enclosing software environments and applications. Those wishing an even deeper understanding are invited to read Appendix D, which presents some RCU implementations that have seen extensive use in production.

The preceding sections listed some desirable properties of the various RCU primitives. The following list is provided for easy reference for those wishing to create a new RCU implementation.

1. There must be read-side primitives (such as rcu_read_lock() and rcu_read_unlock()) and grace-period primitives (such as synchronize_rcu() and call_rcu()), such that any RCU read-side critical section in existence at the start of a grace period has completed by the end of the grace period.

2. RCU read-side primitives should have minimal overhead. In particular, expensive operations such as cache misses, atomic instructions, memory barriers, and branches should be avoided.

3. RCU read-side primitives should have O(1) computational complexity to enable real-time use. (This implies that readers run concurrently with updaters.)

4. RCU read-side primitives should be usable in all contexts (in the Linux kernel, they are permitted everywhere except in the idle loop). An important special case is that RCU read-side primitives be usable within an RCU read-side critical section, in other words, that it be possible to nest RCU read-side critical sections.

5. RCU read-side primitives should be unconditional, with no failure returns. This property is extremely important, as failure checking increases complexity and complicates testing and validation.

6. Any operation other than a quiescent state (and thus a grace period) should be permitted in an RCU read-side critical section. In particular, non-idempotent operations such as I/O should be permitted.

7. It should be possible to update an RCU-protected data structure while executing within an RCU read-side critical section.

8. Both RCU read-side and update-side primitives should be independent of memory allocator design and implementation, in other words, the same RCU implementation should be able to protect a given data structure regardless of how the data elements are allocated and freed.

9. RCU grace periods should not be blocked by threads that halt outside of RCU read-side critical sections. (But note that most quiescent-state-based implementations violate this desideratum.)

Quick Quiz 8.60: Given that grace periods are prohibited within RCU read-side critical sections, how can an RCU data structure possibly be updated while in an RCU read-side critical section?

8.3.5 RCU Exercises

This section is organized as a series of Quick Quizzes that invite you to apply RCU to a number of examples earlier in this book. The answer to each Quick Quiz gives some hints, and
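To make the quiescent-state requirement discussed above more concrete, here is a minimal sketch of the bookkeeping such a scheme might use. It is not the code from Figures 8.44 and 8.45: every toy_ name, the fixed thread count, and the polling-style toy_gp_done() check are assumptions made for illustration, and sequentially consistent C11 atomics stand in for explicit memory barriers. The walk-through is single-threaded, with array slots standing in for threads, so that each step is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_THREADS 2  /* hypothetical fixed number of thread slots */
#define TOY_OFFLINE 0UL   /* sentinel: slot is in an extended quiescent state */

/* Global grace-period counter; starts at 1 so it never equals TOY_OFFLINE. */
atomic_ulong toy_gp_ctr = 1;

/* Per-slot snapshot of toy_gp_ctr taken at the most recent quiescent state.
 * Zero-initialized, so every slot starts out offline. */
atomic_ulong toy_thread_ctr[TOY_NR_THREADS];

/* Announce that slot t is currently outside any read-side critical section. */
void toy_quiescent_state(int t)
{
    atomic_store(&toy_thread_ctr[t], atomic_load(&toy_gp_ctr));
}

/* Coming online is itself a quiescent state. */
void toy_thread_online(int t)
{
    toy_quiescent_state(t);
}

/* Enter an extended quiescent state: grace periods no longer wait on slot t. */
void toy_thread_offline(int t)
{
    atomic_store(&toy_thread_ctr[t], TOY_OFFLINE);
}

/* Begin a grace period and return the counter value slots must reach. */
unsigned long toy_start_gp(void)
{
    return atomic_fetch_add(&toy_gp_ctr, 1) + 1;
}

/* True once every online slot has reported a quiescent state at or after
 * the start of the grace period identified by target. */
bool toy_gp_done(unsigned long target)
{
    for (int t = 0; t < TOY_NR_THREADS; t++) {
        unsigned long seen = atomic_load(&toy_thread_ctr[t]);

        if (seen != TOY_OFFLINE && seen < target)
            return false;
    }
    return true;
}

/* Walk through two grace periods; returns 0 when all checks pass. */
int toy_demo(void)
{
    unsigned long gp;

    toy_thread_online(0);       /* slot 0 may now run readers; slot 1 stays offline */

    gp = toy_start_gp();
    assert(!toy_gp_done(gp));   /* slot 0 has not yet reported: must keep waiting */
    toy_quiescent_state(0);
    assert(toy_gp_done(gp));    /* slot 0 reported, slot 1 is offline */

    gp = toy_start_gp();
    toy_thread_offline(0);      /* going offline also releases the grace period */
    assert(toy_gp_done(gp));
    return 0;
}
```

In this sketch the read-side markers need no code at all, which is why the burden shifts to the periodic toy_quiescent_state() calls: an online slot that stops reporting keeps toy_gp_done() false indefinitely, which is the behavior item 9 of the list warns about.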
