10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?




192 APPENDIX D. READ-COPY UPDATE IMPLEMENTATIONS

Figure D.15: Hierarchical RCU State 4,096 CPUs

structures will acquire the upper rcu_node structure's lock, which is again a 64x reduction from the contention level that would be experienced by Classic RCU running on a 4,096-CPU system.

Quick Quiz D.5: Wait a minute! With all those new locks, how do you avoid deadlock?

Quick Quiz D.6: Why stop at a 64-times reduction? Why not go for a few orders of magnitude instead?

Quick Quiz D.7: But I don't care about McKenney's lame excuses in the answer to Quick Quiz 2!!! I want to get the number of CPUs contending on a single lock down to something reasonable, like sixteen or so!!!

The implementation maintains some per-CPU data, such as lists of RCU callbacks, organized into rcu_data structures. In addition, rcu (as in call_rcu()) and rcu_bh (as in call_rcu_bh()) each maintain their own hierarchy, as shown in Figure D.16.

Figure D.16: Hierarchical RCU State With BH

Quick Quiz D.8: OK, so what is the story with the colors?

The next section discusses energy conservation.

D.2.5 Towards a Greener RCU Implementation

As noted earlier, an important goal of this effort is to leave sleeping CPUs lie in order to promote energy conservation. In contrast, Classic RCU will happily awaken each and every sleeping CPU at least once per grace period in some cases, which is suboptimal in the case where a small number of CPUs are busy doing RCU updates and the majority of the CPUs are mostly idle. This situation occurs frequently in systems sized for peak loads, and we need to be able to accommodate it gracefully.
Furthermore, we need to fix a long-standing bug in Classic RCU where a dynticks-idle CPU servicing an interrupt containing a long-running RCU read-side critical section will fail to prevent an RCU grace period from ending.

Quick Quiz D.9: Given such an egregious bug, why does Linux run at all?

This is accomplished by requiring that all CPUs manipulate counters located in a per-CPU rcu_dynticks structure. Loosely speaking, these counters have even-numbered values when the corresponding CPU is in dynticks-idle mode, and have odd-numbered values otherwise. RCU thus needs to wait for quiescent states only for those CPUs whose rcu_dynticks counters are odd, and need not wake up sleeping CPUs, whose counters will be even. As shown in Figure D.17, each per-CPU rcu_dynticks structure is shared by the "rcu" and "rcu_bh" implementations.

The following section presents a high-level view of the RCU state machine.

D.2.6 State Machine

At a sufficiently high level, Linux-kernel RCU implementations can be thought of as high-level state
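
The even/odd counter convention from Section D.2.5 can be illustrated with a minimal user-space sketch in C11. This is not the kernel's code: struct dynticks_sketch, NR_CPUS_SKETCH, and every function name below are hypothetical stand-ins chosen for illustration, and the sketch models only the "loosely speaking" rule from the text (even means dynticks-idle, odd means otherwise). It does not model interrupts taken from dynticks-idle mode or any other part of the grace-period machinery.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical small CPU count, for illustration only. */
#define NR_CPUS_SKETCH 4

/* Hypothetical stand-in for the per-CPU rcu_dynticks structure. */
struct dynticks_sketch {
	atomic_ulong counter;	/* even: dynticks-idle, odd: otherwise */
};

static struct dynticks_sketch dynticks[NR_CPUS_SKETCH];

/* Assume every CPU starts out non-idle, so each counter starts odd. */
static void dynticks_init(void)
{
	for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		atomic_init(&dynticks[cpu].counter, 1);
}

/* Entering dynticks-idle mode moves the counter from odd to even. */
static void dynticks_enter_idle(size_t cpu)
{
	unsigned long old = atomic_fetch_add(&dynticks[cpu].counter, 1);

	assert(old % 2 == 1);	/* must have been non-idle */
}

/* Leaving dynticks-idle mode moves the counter from even to odd. */
static void dynticks_exit_idle(size_t cpu)
{
	unsigned long old = atomic_fetch_add(&dynticks[cpu].counter, 1);

	assert(old % 2 == 0);	/* must have been idle */
}

/* Only CPUs with odd counters need to report a quiescent state. */
static bool cpu_needs_quiescent_state(size_t cpu)
{
	return atomic_load(&dynticks[cpu].counter) % 2 == 1;
}

/* Count the CPUs a grace period would have to wait for. */
static size_t count_cpus_to_wait_for(void)
{
	size_t n = 0;

	for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (cpu_needs_quiescent_state(cpu))
			n++;
	return n;
}
```

In this sketch, the grace-period side only reads the counters, so a CPU whose counter is even is skipped without being woken, which is the energy-saving property the section describes.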
