10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


D.4. PREEMPTABLE RCU

 1 void __rcu_read_lock(void)
 2 {
 3   int idx;
 4   struct task_struct *t = current;
 5   int nesting;
 6
 7   nesting = ACCESS_ONCE(t->rcu_read_lock_nesting);
 8   if (nesting != 0) {
 9     t->rcu_read_lock_nesting = nesting + 1;
10   } else {
11     unsigned long flags;
12
13     local_irq_save(flags);
14     idx = ACCESS_ONCE(rcu_ctrlblk.completed) & 0x1;
15     ACCESS_ONCE(__get_cpu_var(rcu_flipctr)[idx])++;
16     ACCESS_ONCE(t->rcu_read_lock_nesting) = nesting + 1;
17     ACCESS_ONCE(t->rcu_flipctr_idx) = idx;
18     local_irq_restore(flags);
19   }
20 }

Figure D.73: rcu_read_lock() Implementation

The __rcu_advance_callbacks() function, shown in Figure D.72, advances callbacks and acknowledges the counter flip. Line 7 checks to see if the global rcu_ctrlblk.completed counter has advanced since the last call by the current CPU to this function. If not, callbacks need not be advanced (lines 8-37). Otherwise, lines 8 through 37 advance callbacks through the lists (while maintaining a count of the number of non-empty lists in the wlc variable). In either case, lines 38 through 43 acknowledge the counter flip if needed.

Quick Quiz D.58: How is it possible for lines 38-43 of __rcu_advance_callbacks() to be executed when lines 7-37 have not? Won't they both be executed just after a counter flip, and never at any other time?

D.4.2.4 Read-Side Primitives

This section examines the rcu_read_lock() and rcu_read_unlock() primitives, followed by a discussion of how this implementation deals with the fact that these two primitives do not contain memory barriers.

rcu_read_lock()

The implementation of rcu_read_lock() is as shown in Figure D.73. Line 7 fetches this task's RCU read-side critical-section nesting counter. If line 8 finds that this counter is non-zero, then we are already protected by an outer rcu_read_lock(), in which case line 9 simply increments this counter.

However, if this is the outermost rcu_read_lock(), then more work is required.
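The two paths through rcu_read_lock() can be modeled in user space. What follows is a minimal single-threaded sketch, not the kernel's code: every toy_ name is hypothetical, TOY_ACCESS_ONCE() is a stand-in built on a volatile cast (it relies on the GCC/Clang __typeof__ extension), irq disabling and per-CPU data are omitted, and toy_rcu_read_unlock() is an assumed inverse included only to show why the counter index is recorded.

```c
#include <assert.h>

/* Stand-in for ACCESS_ONCE(): a volatile access that the compiler may
 * neither elide nor split.  Assumes the GCC/Clang __typeof__ extension. */
#define TOY_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical per-task state, modeling two task_struct fields. */
struct toy_task {
    int rcu_read_lock_nesting;
    int rcu_flipctr_idx;
};

static long toy_completed;   /* models rcu_ctrlblk.completed */
static long toy_flipctr[2];  /* models one CPU's rcu_flipctr[] pair */

static void toy_rcu_read_lock(struct toy_task *t)
{
    int nesting = TOY_ACCESS_ONCE(t->rcu_read_lock_nesting);

    if (nesting != 0) {
        /* Nested call: an outer read-side critical section already
         * holds a counter, so only the nesting depth changes. */
        t->rcu_read_lock_nesting = nesting + 1;
    } else {
        /* Outermost call.  The kernel disables irqs around this
         * sequence; this single-threaded model leaves that out. */
        int idx = TOY_ACCESS_ONCE(toy_completed) & 0x1;

        TOY_ACCESS_ONCE(toy_flipctr[idx])++;
        TOY_ACCESS_ONCE(t->rcu_read_lock_nesting) = nesting + 1;
        TOY_ACCESS_ONCE(t->rcu_flipctr_idx) = idx;
    }
}

/* Assumed inverse, for illustration only: the outermost unlock
 * decrements the counter whose index was recorded at lock time. */
static void toy_rcu_read_unlock(struct toy_task *t)
{
    int nesting = TOY_ACCESS_ONCE(t->rcu_read_lock_nesting);

    assert(nesting > 0);
    if (nesting == 1)
        TOY_ACCESS_ONCE(toy_flipctr[t->rcu_flipctr_idx])--;
    TOY_ACCESS_ONCE(t->rcu_read_lock_nesting) = nesting - 1;
}
```

In this model, incrementing toy_completed between the lock and the matching unlock still leaves both counters at zero afterwards, because the unlock uses the recorded index rather than recomputing it from the current value of toy_completed.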
Lines 13 and 18 suppress and restore irqs to ensure that the intervening code is neither preempted nor interrupted by a scheduling-clock interrupt (which runs the grace-period state machine). Line 14 fetches the grace-period counter, line 15 increments the current counter for this CPU, line 16 increments the nesting counter, and line 17 records the old/new counter index so that rcu_read_unlock() can decrement the corresponding counter (but on whatever CPU it ends up running on).

The ACCESS_ONCE() macros force the compiler to emit the accesses in order. Although this does not prevent the CPU from reordering the accesses from the viewpoint of other CPUs, it does ensure that NMI and SMI handlers running on this CPU will see these accesses in order. This is critically important:

1. In the absence of the ACCESS_ONCE() in the assignment to idx, the compiler would be within its rights to: (a) eliminate the local variable idx and (b) compile the increment on line 15 as a fetch-increment-store sequence, doing separate accesses to rcu_ctrlblk.completed for the fetch and the store. If the value of rcu_ctrlblk.completed had changed in the meantime, this would corrupt the rcu_flipctr values.

2. If the assignment to rcu_read_lock_nesting (line 16) were to be reordered to precede the increment of rcu_flipctr (line 15), and if an NMI occurred between these two events, then an rcu_read_lock() in that NMI's handler would incorrectly conclude that it was already under the protection of rcu_read_lock().

3. If the assignment to rcu_read_lock_nesting (line 16) were to be reordered to follow the assignment to rcu_flipctr_idx (line 17), and if an NMI occurred between these two events, then an rcu_read_lock() in that NMI's handler would clobber rcu_flipctr_idx, possibly causing the matching rcu_read_unlock() to decrement the wrong counter.
   This in turn could result in premature ending of a grace period, indefinite extension of a grace period, or even both.

It is not clear that the ACCESS_ONCE() on the assignment to nesting (line 7) is required. It is also unclear whether the smp_read_barrier_depends() (which does not appear in Figure D.73) is needed: it was added to ensure that changes to index and value remain ordered.

The reasons that irqs must be disabled from line 13 through line 18 are as follows:

1. Suppose one CPU loaded rcu_ctrlblk.completed (line 14), then a second CPU incremented this counter, and then the first CPU
