10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?



APPENDIX F. ANSWERS TO QUICK QUIZZES

Quick Quiz D.14:
But what if all the CPUs end up in dyntick-idle mode? Wouldn't that prevent the current RCU grace period from ever ending?

Answer:
Indeed it will! However, CPUs that have RCU callbacks are not permitted to enter dyntick-idle mode, so the only way that all the CPUs could possibly end up in dyntick-idle mode would be if there were absolutely no RCU callbacks in the system. And if there are no RCU callbacks in the system, then there is no need for the RCU grace period to end. In fact, there is no need for the RCU grace period to even start.

RCU will restart if some irq handler does a call_rcu(), which will cause an RCU callback to appear on the corresponding CPU, which will force that CPU out of dyntick-idle mode, which will in turn permit the current RCU grace period to come to an end.

Quick Quiz D.15:
Given that force_quiescent_state() is a three-phase state machine, don't we have triple the scheduling latency due to scanning all the CPUs?

Answer:
Ah, but the three phases will not execute back-to-back on the same CPU, and, furthermore, the first (initialization) phase doesn't do any scanning. Therefore, the scheduling-latency hit of the three-phase algorithm is no different than that of a single-phase algorithm. If the scheduling latency becomes a problem, one approach would be to recode the state machine to scan the CPUs incrementally, most likely by keeping state on a per-leaf-rcu_node basis. But first show me a problem in the real world, then I will consider fixing it!

Quick Quiz D.16:
But the other reason to hold ->onofflock is to prevent multiple concurrent online/offline operations, right?

Answer:
Actually, no! The CPU-hotplug code's synchronization design prevents multiple concurrent CPU online/offline operations, so only one CPU online/offline operation can be executing at any given time.
Therefore, the only purpose of ->onofflock is to prevent a CPU online or offline operation from running concurrently with grace-period initialization.

Quick Quiz D.17:
Given all these acquisitions of the global ->onofflock, won't there be horrible lock contention when running with thousands of CPUs?

Answer:
Actually, there can be only three acquisitions of this lock per grace period, and each grace period lasts many milliseconds. One of the acquisitions is by the CPU initializing for the current grace period, and the other two onlining and offlining some CPU. These latter two cannot run concurrently due to the CPU-hotplug locking, so at most two CPUs can be contending for this lock at any given time. Lock contention on ->onofflock should therefore be no problem, even on systems with thousands of CPUs.

Quick Quiz D.18:
Why not simplify the code by merging the detection of dyntick-idle CPUs with that of offline CPUs?

Answer:
It might well be that such merging may eventually be the right thing to do. In the meantime, however, there are some challenges:

1. CPUs are not allowed to go into dyntick-idle mode while they have RCU callbacks pending, but CPUs are allowed to go offline with callbacks pending. This means that CPUs going offline need to have their callbacks migrated to some other CPU, thus, we cannot allow CPUs to simply go quietly offline.

2. Present-day Linux systems run with NR_CPUS much larger than the actual number of CPUs. A unified approach could thus end up uselessly waiting on CPUs that are not just offline, but which never existed in the first place.

3. RCU is already operational when CPUs get onlined one at a time during boot, and therefore must handle the online process. This onlining must exclude grace-period initialization, so the ->onofflock must still be used.

4. CPUs often switch into and out of dyntick-idle mode extremely frequently, so it is not reason-
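
The argument in the answer to Quick Quiz D.14 can be illustrated with a toy model. The sketch below is not kernel code: `struct toy_cpu` and every `toy_*` function are hypothetical names invented for illustration, and the model only assumes the two rules stated in that answer, namely that a CPU with pending callbacks may not enter dyntick-idle mode and that a call_rcu() from an irq handler forces its CPU out of dyntick-idle mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU state; hypothetical, not the kernel's data structures. */
struct toy_cpu {
	int nr_callbacks;   /* RCU callbacks queued on this CPU */
	bool dyntick_idle;  /* true while the CPU is in dyntick-idle mode */
};

/* A CPU with pending callbacks is refused entry to dyntick-idle mode. */
static bool toy_try_enter_dyntick_idle(struct toy_cpu *cpu)
{
	if (cpu->nr_callbacks > 0)
		return false;
	cpu->dyntick_idle = true;
	return true;
}

/*
 * Models an irq handler doing call_rcu(): the new callback forces
 * the CPU out of dyntick-idle mode.
 */
static void toy_call_rcu(struct toy_cpu *cpu)
{
	cpu->nr_callbacks++;
	cpu->dyntick_idle = false;
}

/* Returns true only if every CPU is in dyntick-idle mode. */
static bool toy_all_dyntick_idle(const struct toy_cpu *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!cpus[i].dyntick_idle)
			return false;
	return true;
}

/* Sums the callbacks queued across all CPUs. */
static int toy_total_callbacks(const struct toy_cpu *cpus, size_t n)
{
	int total = 0;

	for (size_t i = 0; i < n; i++)
		total += cpus[i].nr_callbacks;
	return total;
}

/* Invariant from the answer: all CPUs idle implies no callbacks anywhere. */
static void toy_check_invariant(const struct toy_cpu *cpus, size_t n)
{
	if (toy_all_dyntick_idle(cpus, n))
		assert(toy_total_callbacks(cpus, n) == 0);
}
```

In this model the only way to set `dyntick_idle` is through `toy_try_enter_dyntick_idle()`, and the only way to add a callback is through `toy_call_rcu()`, which clears the flag. The invariant checked by `toy_check_invariant()` therefore cannot fail, which mirrors the claim that a system with every CPU in dyntick-idle mode has no callbacks waiting on a grace period.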
