10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


76 CHAPTER 8. DEFERRED PROCESSING

• void smp_mb__before_atomic_dec(void); Issues a memory barrier and disables code-motion compiler optimizations only if the platform's atomic_dec() primitive does not already do so.

• struct rcu_head A data structure used by the RCU infrastructure to track objects awaiting a grace period. This is normally included as a field within an RCU-protected data structure.

8.2.3 Counter Optimizations

In some cases where increments and decrements are common, but checks for zero are rare, it makes sense to maintain per-CPU or per-task counters, as was discussed in Chapter 4. See Appendix D.1 for an example of this technique applied to RCU. This approach eliminates the need for atomic instructions or memory barriers on the increment and decrement primitives, but still requires that code-motion compiler optimizations be disabled. In addition, the primitives such as synchronize_srcu() that check for the aggregate reference count reaching zero can be quite slow. This underscores the fact that these techniques are designed for situations where the references are frequently acquired and released, but where it is rarely necessary to check for a zero reference count.

8.3 Read-Copy Update (RCU)

8.3.1 RCU Fundamentals

Authors: Paul E. McKenney and Jonathan Walpole

Read-copy update (RCU) is a synchronization mechanism that was added to the Linux kernel in October of 2002. RCU achieves scalability improvements by allowing reads to occur concurrently with updates. In contrast with conventional locking primitives that ensure mutual exclusion among concurrent threads regardless of whether they be readers or updaters, or with reader-writer locks that allow concurrent reads but not in the presence of updates, RCU supports concurrency between a single updater and multiple readers.
RCU ensures that reads are coherent by maintaining multiple versions of objects and ensuring that they are not freed up until all pre-existing read-side critical sections complete. RCU defines and uses efficient and scalable mechanisms for publishing and reading new versions of an object, and also for deferring the collection of old versions. These mechanisms distribute the work among read and update paths in such a way as to make read paths extremely fast. In some cases (non-preemptable kernels), RCU's read-side primitives have zero overhead.

    struct foo {
      int a;
      int b;
      int c;
    };
    struct foo *gp = NULL;

    /* . . . */

    p = kmalloc(sizeof(*p), GFP_KERNEL);
    p->a = 1;
    p->b = 2;
    p->c = 3;
    gp = p;

Figure 8.5: Data Structure Publication (Unsafe)

Quick Quiz 8.6: But doesn't seqlock also permit readers and updaters to get work done concurrently?

This leads to the question "what exactly is RCU?", and perhaps also to the question "how can RCU possibly work?" (or, not infrequently, the assertion that RCU cannot possibly work). This document addresses these questions from a fundamental viewpoint; later installments look at them from usage and from API viewpoints. This last installment also includes a list of references.

RCU is made up of three fundamental mechanisms, the first being used for insertion, the second being used for deletion, and the third being used to allow readers to tolerate concurrent insertions and deletions. Section 8.3.1.1 describes the publish-subscribe mechanism used for insertion, Section 8.3.1.2 describes how waiting for pre-existing RCU readers enables deletion, and Section 8.3.1.3 discusses how maintaining multiple versions of recently updated objects permits concurrent insertions and deletions. Finally, Section 8.3.1.4 summarizes RCU fundamentals.

8.3.1.1 Publish-Subscribe Mechanism

One key attribute of RCU is the ability to safely scan data, even though that data is being modified concurrently.
To provide this ability for concurrent insertion, RCU uses what can be thought of as a publish-subscribe mechanism. For example, consider an initially NULL global pointer gp that is to be modified to point to a newly allocated and initialized data structure. The code fragment shown in Figure 8.5 (with the addition of appropriate locking) might be used for this purpose.

Unfortunately, there is nothing forcing the compiler and CPU to execute the last four assignment statements in order. If the assignment to gp hap-
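
The ordering hazard in Figure 8.5 can be illustrated with a minimal user-space sketch that uses C11 atomics rather than kernel primitives. The helper names publish_foo() and read_foo_sum() are hypothetical, malloc() stands in for kmalloc(), and the release/acquire pairing is an assumed analogue of the ordering that publication needs, not the kernel's own implementation. (In the Linux kernel itself, the corresponding publish and subscribe primitives are rcu_assign_pointer() and rcu_dereference().)

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct foo {
    int a;
    int b;
    int c;
};

/* Global pointer, initially NULL, as in Figure 8.5. */
static _Atomic(struct foo *) gp = NULL;

/*
 * Hypothetical publisher: initialize the structure completely, then make it
 * visible. The release store keeps the three field assignments from being
 * reordered after the pointer assignment by either the compiler or the CPU.
 * Reclaiming a previously published structure is deliberately omitted.
 */
static int publish_foo(int a, int b, int c)
{
    struct foo *p = malloc(sizeof(*p));

    if (p == NULL)
        return -1;
    p->a = a;
    p->b = b;
    p->c = c;
    atomic_store_explicit(&gp, p, memory_order_release);
    return 0;
}

/*
 * Hypothetical subscriber: the acquire load pairs with the release store
 * above, so a reader that sees a non-NULL pointer also sees the initialized
 * fields. Returns -1 if nothing has been published yet.
 */
static int read_foo_sum(int *sum)
{
    struct foo *p;

    assert(sum != NULL);
    p = atomic_load_explicit(&gp, memory_order_acquire);
    if (p == NULL)
        return -1;
    *sum = p->a + p->b + p->c;
    return 0;
}
```

This sketch covers only insertion. It says nothing about when an older version of the structure may safely be freed, which is the subject of the grace-period mechanism described in Section 8.3.1.2.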
