10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


D.3. HIERARCHICAL RCU CODE WALKTHROUGH

     1 static void
     2 cpu_quiet(int cpu, struct rcu_state *rsp,
     3           struct rcu_data *rdp, long lastcomp)
     4 {
     5   unsigned long flags;
     6   unsigned long mask;
     7   struct rcu_node *rnp;
     8
     9   rnp = rdp->mynode;
    10   spin_lock_irqsave(&rnp->lock, flags);
    11   if (lastcomp != ACCESS_ONCE(rsp->completed)) {
    12     rdp->passed_quiesc = 0;
    13     spin_unlock_irqrestore(&rnp->lock, flags);
    14     return;
    15   }
    16   mask = rdp->grpmask;
    17   if ((rnp->qsmask & mask) == 0) {
    18     spin_unlock_irqrestore(&rnp->lock, flags);
    19   } else {
    20     rdp->qs_pending = 0;
    21     rdp = rsp->rda[smp_processor_id()];
    22     rdp->nxttail[RCU_NEXT_READY_TAIL] =
    23       rdp->nxttail[RCU_NEXT_TAIL];
    24     cpu_quiet_msk(mask, rsp, rnp, flags);
    25   }
    26 }

Figure D.40: Code for cpu_quiet()

Otherwise, line 16 forms a mask with the specified CPU's bit set. Line 17 checks to see if this bit is still set in the leaf rcu_node structure, and, if not, line 18 releases the lock and re-enables interrupts.

On the other hand, if the CPU's bit is still set, line 20 clears ->qs_pending, reflecting that this CPU has passed through its quiescent state for this grace period. Line 21 then overwrites local variable rdp with a pointer to the running CPU's rcu_data structure, and lines 22-23 update the running CPU's RCU callbacks so that all those not yet associated with a specific grace period will be serviced by the next grace period. Finally, line 24 clears bits up the rcu_node hierarchy, ending the current grace period if appropriate and perhaps even starting a new one. Note that cpu_quiet() releases the lock and re-enables interrupts.

Quick Quiz D.47: How do lines 22-23 of Figure D.40 know that it is safe to promote the running CPU's RCU callbacks?

Figure D.41 shows cpu_quiet_msk(), which updates the rcu_node hierarchy to reflect the passage of the CPUs indicated by argument mask through their respective quiescent states.
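Returning briefly to lines 22-23 of Figure D.40: the single pointer assignment there can promote a whole batch of callbacks at once. The following is a minimal user-space sketch of that idea, assuming that the callbacks form one singly linked list divided into segments by an array of tail pointers, as the nxttail[] indexing suggests. The names cb, cb_list, cb_enqueue(), cb_promote_next(), and cb_count() are hypothetical stand-ins rather than kernel identifiers, and all locking and interrupt handling is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an RCU callback. */
struct cb {
    struct cb *next;
};

/* Segment indexes, named after the constants used in Figure D.40. */
enum { DONE_TAIL, WAIT_TAIL, NEXT_READY_TAIL, NEXT_TAIL, NSEG };

/* One linked list; tail[i] points at the ->next field ending segment i. */
struct cb_list {
    struct cb *head;
    struct cb **tail[NSEG];
};

/* An empty list has every segment tail referencing the list head. */
static void cb_list_init(struct cb_list *l)
{
    l->head = NULL;
    for (int i = 0; i < NSEG; i++)
        l->tail[i] = &l->head;
}

/* New callbacks always enter the last (NEXT) segment. */
static void cb_enqueue(struct cb_list *l, struct cb *c)
{
    c->next = NULL;
    *l->tail[NEXT_TAIL] = c;
    l->tail[NEXT_TAIL] = &c->next;
}

/* Analog of lines 22-23: move every NEXT callback into NEXT_READY. */
static void cb_promote_next(struct cb_list *l)
{
    l->tail[NEXT_READY_TAIL] = l->tail[NEXT_TAIL];
}

/* Count the callbacks currently in segment seg. */
static int cb_count(struct cb_list *l, int seg)
{
    assert(seg >= 0 && seg < NSEG);
    struct cb **pp = (seg == 0) ? &l->head : l->tail[seg - 1];
    int n = 0;

    while (pp != l->tail[seg]) {
        pp = &(*pp)->next;
        n++;
    }
    return n;
}
```

In this sketch, after two callbacks are enqueued, cb_count(&l, NEXT_TAIL) is 2 and cb_count(&l, NEXT_READY_TAIL) is 0. A call to cb_promote_next(&l) swaps those two counts without touching any list element, which is why the promotion costs the same regardless of how many callbacks are queued.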
Note that argument rnp of cpu_quiet_msk() is the leaf rcu_node structure corresponding to the specified CPUs.

Quick Quiz D.48: Given that argument mask on line 2 of Figure D.41 is an unsigned long, how can it possibly deal with systems with more than 64 CPUs?

     1 static void
     2 cpu_quiet_msk(unsigned long mask, struct rcu_state *rsp,
     3               struct rcu_node *rnp, unsigned long flags)
     4   __releases(rnp->lock)
     5 {
     6   for (;;) {
     7     if (!(rnp->qsmask & mask)) {
     8       spin_unlock_irqrestore(&rnp->lock, flags);
     9       return;
    10     }
    11     rnp->qsmask &= ~mask;
    12     if (rnp->qsmask != 0) {
    13       spin_unlock_irqrestore(&rnp->lock, flags);
    14       return;
    15     }
    16     mask = rnp->grpmask;
    17     if (rnp->parent == NULL) {
    18       break;
    19     }
    20     spin_unlock_irqrestore(&rnp->lock, flags);
    21     rnp = rnp->parent;
    22     spin_lock_irqsave(&rnp->lock, flags);
    23   }
    24   rsp->completed = rsp->gpnum;
    25   rcu_process_gp_end(rsp, rsp->rda[smp_processor_id()]);
    26   rcu_start_gp(rsp, flags);
    27 }

Figure D.41: Code for cpu_quiet_msk()

Line 4 is annotation for the sparse utility, indicating that cpu_quiet_msk() releases the leaf rcu_node structure's lock.

Each pass through the loop spanning lines 6-23 does the required processing for one level of the rcu_node hierarchy, traversing the data structures as shown by the blue arrow in Figure D.42.

Line 7 checks to see if all of the bits in mask have already been cleared in the current rcu_node structure's ->qsmask field, and, if so, line 8 releases the lock and re-enables interrupts, and line 9 returns to the caller. If not, line 11 clears the bits specified by mask from the current rcu_node structure's ->qsmask field. Line 12 then checks to see if there are more bits remaining in ->qsmask, and, if so, line 13 releases the lock and re-enables interrupts, and line 14 returns to the caller.

Otherwise, it is necessary to advance up to the next level of the rcu_node hierarchy. In preparation for this next level, line 16 places into mask a value with the single bit set corresponding to the current rcu_node structure within its parent.
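The per-level processing and the hand-off of the mask from child to parent can be tried out in user space. The following is a minimal sketch, assuming a small tree of hypothetical sim_node structures that keep only the ->qsmask, ->grpmask, and ->parent fields. It deliberately leaves out locking, interrupt control, and the grace-period bookkeeping on lines 24-26 of Figure D.41, and instead returns a flag saying whether the root was emptied. The names sim_node and sim_report_qs() are illustrative and do not appear in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct rcu_node; no locking. */
struct sim_node {
    unsigned long qsmask;    /* children or CPUs still owing a quiescent state */
    unsigned long grpmask;   /* this node's own bit within parent->qsmask */
    struct sim_node *parent; /* NULL for the root */
};

/*
 * Clear the bits in mask at node np, then keep moving upward for as
 * long as each level becomes empty.  Returns true only when the root's
 * qsmask reaches zero, which is the point where Figure D.41 would go
 * on to end the grace period.
 */
static bool sim_report_qs(struct sim_node *np, unsigned long mask)
{
    assert(np != NULL);
    for (;;) {
        if (!(np->qsmask & mask))
            return false;       /* these bits were already cleared */
        np->qsmask &= ~mask;
        if (np->qsmask != 0)
            return false;       /* others are still pending at this level */
        mask = np->grpmask;     /* now refers to this node's bit in its parent */
        if (np->parent == NULL)
            return true;        /* root emptied */
        np = np->parent;
    }
}
```

With two leaves under one root, sim_report_qs() returns true only for the report that clears the last remaining bit of the last non-empty leaf. Every earlier report stops at whichever level still has bits set. Note that each mask value in this sketch is interpreted relative to one node only, so the same bit position can mean different things at different levels.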
Line 17 checks to see if there in fact is a parent for the current rcu_node structure, and, if not, line 18 breaks from the loop. On the other hand, if there is a parent rcu_node structure, line 20 releases the current rcu_node structure's lock, line 21 advances the rnp local variable to the parent, and line 22 acquires the parent's lock. Execution then continues at the beginning of the loop on line 7.

If line 18 breaks from the loop, we know that the current grace period has ended, as the only way that all bits can be cleared in the root rcu_node structure is if all CPUs have passed through quiescent
