10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


D.3. HIERARCHICAL RCU CODE WALKTHROUGH

ing that rcu_start_gp() releases the root rcu_node structure’s lock. Local variable rdp references the running CPU’s rcu_data structure, rnp references the root rcu_node structure, and rnp_cur and rnp_end are used as cursors in traversing the rcu_node hierarchy.

Line 10 invokes cpu_needs_another_gp() to see if this CPU really needs another grace period to be started, and if not, line 11 releases the root rcu_node structure’s lock and line 12 returns. This code path can be executed due to multiple CPUs concurrently attempting to start a grace period. In this case, the winner will start the grace period, and the losers will exit out via this code path.

Otherwise, line 14 increments the specified rcu_state structure’s ->gpnum field, officially marking the start of a new grace period.

Quick Quiz D.43: But there has been no initialization yet at line 15 of Figure D.37! What happens if a CPU notices the new grace period and immediately attempts to report a quiescent state? Won’t it get confused?

Line 15 sets the ->signaled field to RCU_GP_INIT in order to prevent any other CPU from attempting to force an end to the new grace period before its initialization completes. Lines 16-18 schedule the next attempt to force an end to the new grace period, first in terms of jiffies and second in terms of the number of calls to rcu_pending. Of course, if the grace period ends naturally before that time, there will be no need to attempt to force it. Line 20 invokes record_gp_stall_check_time() to schedule a longer-term progress check: if the grace period extends beyond this time, it should be considered to be an error. Line 22 invokes note_new_gpnum() in order to initialize this CPU’s rcu_data structure to account for the new grace period.

Lines 23-26 advance all of this CPU’s callbacks so that they will be eligible to be invoked at the end of this new grace period.
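The callback advancement performed by lines 23-26 can be illustrated with a small, self-contained sketch. This is not the kernel's code: struct cb, struct cb_list, and the cb_*() helpers are hypothetical stand-ins, and the four-segment layout is assumed to follow the kernel's nxttail[] convention, of which the text names only RCU_NEXT_READY_TAIL and RCU_NEXT_TAIL.

```c
#include <stddef.h>

/* Hypothetical, simplified callback; the kernel uses struct rcu_head. */
struct cb {
	struct cb *next;
};

/* Segment indexes, named after the kernel's nxttail[] indexes. */
enum {
	RCU_DONE_TAIL,       /* grace period already ended; ready to invoke */
	RCU_WAIT_TAIL,       /* waiting for the current grace period */
	RCU_NEXT_READY_TAIL, /* will wait for the next grace period */
	RCU_NEXT_TAIL,       /* newly queued, not yet assigned */
	RCU_NEXT_SIZE
};

/* One singly linked list, split into segments by tail pointers. */
struct cb_list {
	struct cb *head;
	struct cb **tail[RCU_NEXT_SIZE]; /* tail[i]: end of segment i */
};

void cb_list_init(struct cb_list *l)
{
	int i;

	l->head = NULL;
	for (i = 0; i < RCU_NEXT_SIZE; i++)
		l->tail[i] = &l->head; /* all segments start out empty */
}

/* New callbacks always enter the RCU_NEXT_TAIL segment. */
void cb_enqueue(struct cb_list *l, struct cb *c)
{
	c->next = NULL;
	*l->tail[RCU_NEXT_TAIL] = c;
	l->tail[RCU_NEXT_TAIL] = &c->next;
}

/* Grace-period starter: move every queued callback into RCU_WAIT_TAIL. */
void cb_advance_all(struct cb_list *l)
{
	l->tail[RCU_NEXT_READY_TAIL] = l->tail[RCU_NEXT_TAIL];
	l->tail[RCU_WAIT_TAIL] = l->tail[RCU_NEXT_TAIL];
}

/* Count the callbacks currently in segment seg. */
int cb_segment_len(struct cb_list *l, int seg)
{
	struct cb **pp = seg ? l->tail[seg - 1] : &l->head;
	int n = 0;

	while (pp != l->tail[seg]) {
		n++;
		pp = &(*pp)->next;
	}
	return n;
}
```

In this sketch, once cb_advance_all() has run, every callback queued so far sits in the RCU_WAIT_TAIL segment, so all of them wait only for the grace period that is just starting.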
This represents an acceleration of callbacks, as other CPUs would only be able to move the RCU_NEXT_READY_TAIL batch to be serviced by the current grace period; the RCU_NEXT_TAIL would instead need to be advanced to the RCU_NEXT_READY_TAIL batch. The reason that this CPU can accelerate the RCU_NEXT_TAIL batch is that it knows exactly when this new grace period started. In contrast, other CPUs would be unable to correctly resolve the race between the start of a new grace period and the arrival of a new RCU callback.

Line 27 checks to see if there is but one rcu_node structure in the hierarchy, and if so, line 28 sets the ->qsmask bits corresponding to all online CPUs, in other words, corresponding to those CPUs that must pass through a quiescent state for the new grace period to end. Line 29 releases the root rcu_node structure’s lock and line 30 returns. In this case, gcc’s dead-code elimination is expected to dispense with lines 32-46.

Otherwise, the rcu_node hierarchy has multiple structures, requiring a more involved initialization scheme. Line 32 releases the root rcu_node structure’s lock, but keeps interrupts disabled, and then line 33 acquires the specified rcu_state structure’s ->onofflock, preventing any concurrent CPU-hotplug operations from manipulating RCU-specific state.

Line 34 sets the rnp_end local variable to reference the first leaf rcu_node structure, which also happens to be the rcu_node structure immediately following the last non-leaf rcu_node structure in the ->node array. Line 35 sets the rnp_cur local variable to reference the root rcu_node structure, which also happens to be the first such structure in the ->node array. Lines 36 and 37 then traverse all of the non-leaf rcu_node structures, setting the bits corresponding to lower-level rcu_node structures that have CPUs that must pass through quiescent states in order for the new grace period to end.

Quick Quiz D.44: Hey!!!
Shouldn’t we hold the non-leaf rcu_node structures’ locks when munging their state in line 37 of Figure D.37???

Line 38 sets local variable rnp_end to one past the last leaf rcu_node structure, and line 39 sets local variable rnp_cur to the first leaf rcu_node structure, so that the loop spanning lines 40-44 traverses all leaves of the rcu_node hierarchy. During each pass through this loop, line 41 acquires the current leaf rcu_node structure’s lock, line 42 sets the bits corresponding to online CPUs (each of which must pass through a quiescent state before the new grace period can end), and line 43 releases the lock.

Quick Quiz D.45: Why can’t we merge the loop spanning lines 36-37 with the loop spanning lines 40-44 in Figure D.37?

Line 45 then sets the specified rcu_state structure’s ->signaled field to permit forcing of quiescent states, and line 46 releases the ->onofflock to permit CPU-hotplug operations to manipulate RCU state.

D.3.6.4 Reporting Quiescent States

This hierarchical RCU implementation implements a layered approach to reporting quiescent states, using the following functions:

1. rcu_qsctr_inc() and rcu_bh_qsctr_inc() are invoked when a given CPU passes through a
