10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


    CONFIG_SMP=y
    CONFIG_NO_HZ=n
    CONFIG_RCU_CPU_STALL_DETECTOR=n
    CONFIG_HOTPLUG_CPU=n
    CONFIG_RCU_TRACE=y
    CONFIG_PREEMPT_RCU=n
    CONFIG_CLASSIC_RCU=n
    CONFIG_TREE_RCU=y

10. Disable SMP, CPU-stall detection, dyntick idle mode, and CPU hotplug:

    CONFIG_SMP=n
    CONFIG_NO_HZ=n
    CONFIG_RCU_CPU_STALL_DETECTOR=n
    CONFIG_HOTPLUG_CPU=n
    CONFIG_RCU_TRACE=y
    CONFIG_PREEMPT_RCU=n
    CONFIG_CLASSIC_RCU=n
    CONFIG_TREE_RCU=y

    This combination located a number of compiler warnings.

11. Disable SMP and CPU hotplug:

    CONFIG_SMP=n
    CONFIG_NO_HZ=y
    CONFIG_RCU_CPU_STALL_DETECTOR=y
    CONFIG_HOTPLUG_CPU=n
    CONFIG_RCU_TRACE=y
    CONFIG_PREEMPT_RCU=n
    CONFIG_CLASSIC_RCU=n
    CONFIG_TREE_RCU=y

12. Test Classic RCU with dynticks idle but without preemption:

    CONFIG_NO_HZ=y
    CONFIG_PREEMPT=n
    CONFIG_RCU_TRACE=y
    CONFIG_PREEMPT_RCU=n
    CONFIG_CLASSIC_RCU=y
    CONFIG_TREE_RCU=n

13. Test Classic RCU with preemption but without dynticks idle:

    CONFIG_NO_HZ=n
    CONFIG_PREEMPT=y
    CONFIG_RCU_TRACE=y
    CONFIG_PREEMPT_RCU=n
    CONFIG_CLASSIC_RCU=y
    CONFIG_TREE_RCU=n

14. Test Preemptable RCU with dynticks idle:

    CONFIG_NO_HZ=y
    CONFIG_PREEMPT=y
    CONFIG_RCU_TRACE=y
    CONFIG_PREEMPT_RCU=y
    CONFIG_CLASSIC_RCU=n
    CONFIG_TREE_RCU=n

15. Test Preemptable RCU without dynticks idle:

    CONFIG_NO_HZ=n
    CONFIG_PREEMPT=y
    CONFIG_RCU_TRACE=y
    CONFIG_PREEMPT_RCU=y
    CONFIG_CLASSIC_RCU=n
    CONFIG_TREE_RCU=n

For a large change that affects RCU core code, one should run rcutorture for each of the above combinations, and concurrently with CPU offlining and onlining for cases with CONFIG_HOTPLUG_CPU. For small changes, it may suffice to run kernbench in each case. Of course, if the change is confined to a particular subset of the configuration parameters, it may be possible to reduce the number of test cases.

Torturing software: the Geneva Convention does not (yet) prohibit it, and I strongly recommend it!!!

D.2.9 Conclusion

This hierarchical implementation of RCU reduces lock contention and avoids unnecessarily awakening dyntick-idle sleeping CPUs, while helping to debug Linux’s hotplug-CPU code paths.
This implementation is designed to handle single systems with thousands of CPUs, and on 64-bit systems has an architectural limitation of a quarter million CPUs, a limit I expect to be sufficient for at least the next few years.

This RCU implementation of course has some limitations:

1. The force_quiescent_state() function can scan the full set of CPUs with irqs disabled. This would be fatal in a real-time implementation of RCU, so if hierarchy ever needs to be introduced to preemptable RCU, some other approach will be required. It is possible that it will be problematic on 4,096-CPU systems, but actual testing on such systems is required to prove this one way or the other.

   On busy systems, the force_quiescent_state() scan would not be expected to happen, as CPUs should pass through quiescent states within three jiffies of the start of a grace period. On semi-busy systems, only the CPUs in
