10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


APPENDIX F. ANSWERS TO QUICK QUIZZES

is waiting on a quiescent state from an offline CPU.

Quick Quiz D.26:
So what guards the earlier fields in this structure?

Answer:
Nothing does, as they are constants set at compile time or boot time. Of course, the fields internal to each rcu_node in the ->node array may change, but they are guarded separately.

Quick Quiz D.27:
I thought that RCU read-side processing was supposed to be fast!!! The functions shown in Figure D.21 have so much junk in them that they just have to be slow!!! What gives here?

Answer:
Appearances can be deceiving. The preempt_disable(), preempt_enable(), local_bh_disable(), and local_bh_enable() each do a single non-atomic manipulation of local data. Even that assumes CONFIG_PREEMPT; otherwise, the preempt_disable() and preempt_enable() functions emit no code, not even compiler directives. The __acquire() and __release() functions emit no code (not even compiler directives), but are instead used by the sparse semantic-parsing bug-finding program. Finally, rcu_read_acquire() and rcu_read_release() emit no code (not even compiler directives) unless the “lockdep” lock-order debugging facility is enabled, in which case they can indeed be somewhat expensive.

In short, unless you are a kernel hacker who has enabled debugging options, these functions are extremely cheap, and in some cases, absolutely free of overhead. And, in the words of a Portland-area furniture retailer, “free is a very good price”.

Quick Quiz D.28:
Why not simply use __get_cpu_var() to pick up a reference to the current CPU’s rcu_data structure on line 13 in Figure D.22?

Answer:
Because we might be called either from call_rcu() (in which case we would need __get_cpu_var(rcu_data)) or from call_rcu_bh() (in which case we would need __get_cpu_var(rcu_bh_data)). Using the ->rda[] array of whichever rcu_state structure we were passed works correctly regardless of which API __call_rcu() was invoked from (suggested by Lai Jiangshan [Jia08]).

Quick Quiz D.29:
Given that rcu_pending() is always called twice on lines 29-32 of Figure D.23, shouldn’t there be some way to combine the checks of the two structures?

Answer:
Sorry, but this was a trick question. The C language’s short-circuit boolean expression evaluation means that __rcu_pending() is invoked on rcu_bh_state only if the prior invocation on rcu_state returns zero.

The reason the two calls are in this order is that “rcu” is used more heavily than is “rcu_bh”, so the first call is more likely to return non-zero than is the second.

Quick Quiz D.30:
Shouldn’t line 42 of Figure D.23 also check for in_hardirq()?

Answer:
No. The rcu_read_lock_bh() primitive disables softirq, not hardirq. Because call_rcu_bh() need only wait for pre-existing “rcu_bh” read-side critical sections to complete, we need only check in_softirq().

Quick Quiz D.31:
But don’t we also need to check that a grace period is actually in progress in __rcu_process_callbacks in Figure D.24?

Answer:
Indeed we do! And the first thing that force_quiescent_state() does is to perform exactly that check.

Quick Quiz D.32:
What happens if two CPUs attempt to start a new grace period concurrently in Figure D.24?

Answer:
One of the CPUs will be the first to acquire the root rcu_node structure’s lock, and that CPU will start
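
The short-circuit behavior described in the answer to Quick Quiz D.29 can be illustrated with a small user-space sketch. This is not the kernel code from Figure D.23: the structure rcu_state_sketch, the variables rcu_state_sk and rcu_bh_state_sk, and the functions rcu_pending_one_sk(), rcu_pending_sk(), and demo_short_circuit() are hypothetical stand-ins invented for this illustration. The sketch only counts how often each stand-in structure is examined, so that the effect of the || operator becomes visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's rcu_state structure. */
struct rcu_state_sketch {
	bool work_pending;   /* pretend outcome of the real pending checks */
	int  times_checked;  /* how often this flavor was examined */
};

/* Stand-ins for rcu_state and rcu_bh_state (assumed names). */
struct rcu_state_sketch rcu_state_sk    = { false, 0 };
struct rcu_state_sketch rcu_bh_state_sk = { false, 0 };

/* Stand-in for __rcu_pending(): returns nonzero if work is pending. */
int rcu_pending_one_sk(struct rcu_state_sketch *rsp)
{
	rsp->times_checked++;
	return rsp->work_pending;
}

/*
 * Same shape as the combined check discussed in the answer: the ||
 * operator evaluates its right operand only if the left one is zero.
 */
int rcu_pending_sk(void)
{
	return rcu_pending_one_sk(&rcu_state_sk) ||
	       rcu_pending_one_sk(&rcu_bh_state_sk);
}

/* Exercise both cases and check the call counts. */
void demo_short_circuit(void)
{
	/* "rcu" has work pending: the "rcu_bh" check is skipped. */
	rcu_state_sk.work_pending = true;
	assert(rcu_pending_sk() == 1);
	assert(rcu_state_sk.times_checked == 1);
	assert(rcu_bh_state_sk.times_checked == 0);

	/* "rcu" has nothing pending: only now is "rcu_bh" examined. */
	rcu_state_sk.work_pending = false;
	assert(rcu_pending_sk() == 0);
	assert(rcu_state_sk.times_checked == 2);
	assert(rcu_bh_state_sk.times_checked == 1);
}
```

Calling demo_short_circuit() from a main() function runs both cases. Placing the more frequently pending flavor on the left, as the answer explains for “rcu”, means the second check is skipped more often.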
