10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


F.4. CHAPTER 4: COUNTING

not on any critical path. Now, if we were testing on machines with thousands of CPUs, we might need to omit the lock, but on machines with “only” a hundred or so CPUs, no need to get fancy.

Quick Quiz 4.22:
Fine, but the Linux kernel doesn’t have to acquire a lock when reading out the aggregate value of per-CPU counters. So why should user-space code need to do this???

Answer:
Remember, the Linux kernel’s per-CPU variables are always accessible, even if the corresponding CPU is offline, and even if the corresponding CPU never existed and never will exist.

One workaround is to ensure that each thread sticks around until all threads are finished, as shown in Figure F.2. Analysis of this code is left as an exercise to the reader; however, please note that it does not fit well into the counttorture.h counter-evaluation scheme. (Why not?) Chapter 8 will introduce synchronization mechanisms that handle this situation in a much more graceful manner.

    long __thread counter = 0;              /* this thread's count */
    long *counterp[NR_THREADS] = { NULL };  /* pointers to all threads' counters */
    int finalthreadcount = 0;               /* threads that have finished */
    DEFINE_SPINLOCK(final_mutex);           /* guards finalthreadcount */

    void inc_count(void)
    {
      counter++;
    }

    /* Sum the registered per-thread counters without acquiring a lock. */
    long read_count(void)
    {
      int t;
      long sum = 0;

      for_each_thread(t)
        if (counterp[t] != NULL)
          sum += *counterp[t];
      return sum;
    }

    void count_init(void)
    {
    }

    void count_register_thread(void)
    {
      counterp[smp_thread_id()] = &counter;
    }

    /* Wait until all threads are done so that every counter stays valid. */
    void count_unregister_thread(int nthreadsexpected)
    {
      spin_lock(&final_mutex);
      finalthreadcount++;
      spin_unlock(&final_mutex);
      while (finalthreadcount < nthreadsexpected)
        poll(NULL, 0, 1);
    }

Figure F.2: Per-Thread Statistical Counters With Lockless Summation

Quick Quiz 4.23:
What fundamental difference is there between counting packets and counting the total number of bytes in the packets, given that the packets vary in size?

Answer:
When counting packets, the counter is only incremented by the value one. On the other hand, when counting bytes, the counter might be incremented by largish numbers.

Why does this matter? Because in the increment-by-one case, the value returned will be exact in the sense that the counter must necessarily have taken on that value at some point in time, even if it is impossible to say precisely when that point occurred. In contrast, when counting bytes, two different threads might return values that are inconsistent with any global ordering of operations.

To see this, suppose that thread 0 adds the value three to its counter, thread 1 adds the value five to its counter, and threads 2 and 3 sum the counters. If the system is “weakly ordered” or if the compiler uses aggressive optimizations, thread 2 might find the sum to be three and thread 3 might find the sum to be five. The only possible global orders of the sequence of values of the counter are 0,3,8 and 0,5,8, and neither order is consistent with the results obtained.

If you missed this one, you are not alone. Michael Scott used this question to stump Paul McKenney during Paul’s Ph.D. defense.

Quick Quiz 4.24:
Given that the reader must sum all the threads’ counters, this could take a long time given large numbers of threads. Is there any way that the increment operation can remain fast and scalable while allowing readers to also enjoy reasonable performance and scalability?

Answer:
One approach would be to maintain a global approximation to the value. Updaters would increment their per-thread variable, but when it reached some predefined limit, atomically add it to a global variable, then zero their per-thread variable. This would permit a tradeoff between average increment overhead and accuracy of the value read out.

The reader is encouraged to think up and try out
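The global-approximation approach described in the answer to Quick Quiz 4.24 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the book's code: it uses C11 atomics and POSIX threads rather than the book's api.h primitives, and FLUSH_LIMIT, MAX_THREADS, flush_count(), worker(), and run_demo() are hypothetical names introduced here.

```c
#include <pthread.h>
#include <stdatomic.h>

#define FLUSH_LIMIT 1000  /* hypothetical threshold: larger is faster but less accurate */
#define MAX_THREADS 64    /* hypothetical cap for this demo */

static atomic_long global_count;        /* approximate total, read by readers */
static _Thread_local long local_count;  /* this thread's not-yet-published increments */

/* Fast path: touches only thread-local data until the limit is reached. */
static void inc_count(void)
{
    if (++local_count >= FLUSH_LIMIT) {
        atomic_fetch_add_explicit(&global_count, local_count,
                                  memory_order_relaxed);
        local_count = 0;
    }
}

/* Publish any remainder, for example just before the thread exits. */
static void flush_count(void)
{
    atomic_fetch_add_explicit(&global_count, local_count,
                              memory_order_relaxed);
    local_count = 0;
}

/* Reader: a single load, regardless of the number of threads. */
static long read_count(void)
{
    return atomic_load_explicit(&global_count, memory_order_relaxed);
}

static void *worker(void *arg)
{
    long nincs = *(long *)arg;

    for (long i = 0; i < nincs; i++)
        inc_count();
    flush_count();
    return NULL;
}

/* Hypothetical driver: run nthreads workers and return the final count. */
static long run_demo(int nthreads, long nincs)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    atomic_store(&global_count, 0);
    while (started < nthreads &&
           pthread_create(&tid[started], NULL, worker, &nincs) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return read_count();
}
```

While threads are still running, read_count() can lag the true total by up to FLUSH_LIMIT - 1 per thread, which is the accuracy side of the tradeoff. Once every thread has flushed and been joined, as in run_demo(), the value is exact.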
