10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?

40 CHAPTER 4. COUNTING

ever fail? After all, we picked up its old value on line 9 and have not changed it!

Lines 16-32 of Figure 4.16 show add_count()’s slowpath, which is protected by gblcnt_mutex, which is acquired on line 17 and released on lines 24 and 30. Line 18 invokes globalize_count(), which moves this thread’s state to the global counters. Lines 19-20 check whether the delta value can be accommodated by the current global state, and, if not, line 21 invokes flush_local_count() to flush all threads’ local state to the global counters, and then lines 22-23 recheck whether delta can be accommodated. If, after all that, the addition of delta still cannot be accommodated, then line 24 releases gblcnt_mutex (as noted earlier), and then line 25 returns failure.

Otherwise, line 28 adds delta to the global counter, line 29 spreads counts to the local state if appropriate, line 30 releases gblcnt_mutex (again, as noted earlier), and finally, line 31 returns success.

Lines 34-63 of Figure 4.16 show sub_count(), which is structured similarly to add_count(), having a fastpath on lines 41-48 and a slowpath on lines 49-62. A line-by-line analysis of this function is left as an exercise to the reader.

     1 unsigned long read_count(void)
     2 {
     3   int c;
     4   int cm;
     5   int old;
     6   int t;
     7   unsigned long sum;
     8
     9   spin_lock(&gblcnt_mutex);
    10   sum = globalcount;
    11   for_each_thread(t)
    12     if (counterp[t] != NULL) {
    13       split_counterandmax(counterp[t], &old, &c, &cm);
    14       sum += c;
    15     }
    16   spin_unlock(&gblcnt_mutex);
    17   return sum;
    18 }

Figure 4.17: Atomic Limit Counter Read

Figure 4.17 shows read_count(). Line 9 acquires gblcnt_mutex and line 16 releases it. Line 10 initializes local variable sum to the value of globalcount, and the loop spanning lines 11-15 adds the per-thread counters to this sum, isolating each per-thread counter using split_counterandmax on line 13.
Finally, line 17 returns the sum.

Figure 4.18 shows the utility functions globalize_count(), flush_local_count(), balance_count(), count_register_thread(), and count_unregister_thread(). The code for globalize_count() is shown on lines 1-12, and it is similar to that of previous algorithms, with the

     1 static void globalize_count(void)
     2 {
     3   int c;
     4   int cm;
     5   int old;
     6
     7   split_counterandmax(&counterandmax, &old, &c, &cm);
     8   globalcount += c;
     9   globalreserve -= cm;
    10   old = merge_counterandmax(0, 0);
    11   atomic_set(&counterandmax, old);
    12 }
    13
    14 static void flush_local_count(void)
    15 {
    16   int c;
    17   int cm;
    18   int old;
    19   int t;
    20   int zero;
    21
    22   if (globalreserve == 0)
    23     return;
    24   zero = merge_counterandmax(0, 0);
    25   for_each_thread(t)
    26     if (counterp[t] != NULL) {
    27       old = atomic_xchg(counterp[t], zero);
    28       split_counterandmax_int(old, &c, &cm);
    29       globalcount += c;
    30       globalreserve -= cm;
    31     }
    32 }
    33
    34 static void balance_count(void)
    35 {
    36   int c;
    37   int cm;
    38   int old;
    39   unsigned long limit;
    40
    41   limit = globalcountmax - globalcount - globalreserve;
    42   limit /= num_online_threads();
    43   if (limit > MAX_COUNTERMAX)
    44     cm = MAX_COUNTERMAX;
    45   else
    46     cm = limit;
    47   globalreserve += cm;
    48   c = cm / 2;
    49   if (c > globalcount)
    50     c = globalcount;
    51   globalcount -= c;
    52   old = merge_counterandmax(c, cm);
    53   atomic_set(&counterandmax, old);
    54 }
    55
    56 void count_register_thread(void)
    57 {
    58   int idx = smp_thread_id();
    59
    60   spin_lock(&gblcnt_mutex);
    61   counterp[idx] = &counterandmax;
    62   spin_unlock(&gblcnt_mutex);
    63 }
    64
    65 void count_unregister_thread(int nthreadsexpected)
    66 {
    67   int idx = smp_thread_id();
    68
    69   spin_lock(&gblcnt_mutex);
    70   globalize_count();
    71   counterp[idx] = NULL;
    72   spin_unlock(&gblcnt_mutex);
    73 }

Figure 4.18: Atomic Limit Counter Utility Functions
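The figures call split_counterandmax(), split_counterandmax_int(), and merge_counterandmax() without showing their bodies on this page. The following is a minimal sketch of how a counter and its maximum might share a single atomic word. It assumes C11 <stdatomic.h> in place of the atomic_t primitives used in the figures, unsigned fields, and a layout with the counter in the upper half of the word and the maximum in the lower half. The flush_one() helper is hypothetical and only mirrors the per-thread step on lines 27-30 of Figure 4.18.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed layout: counter in the upper half of the word, max in the lower. */
#define CM_BITS        (sizeof(unsigned int) * 4)   /* half of the word's bits */
#define MAX_COUNTERMAX ((1u << CM_BITS) - 1)        /* largest value per field */

/* Pack a counter and its maximum into one word. */
unsigned int merge_counterandmax(unsigned int c, unsigned int cm)
{
    assert(c <= MAX_COUNTERMAX && cm <= MAX_COUNTERMAX);
    return (c << CM_BITS) | cm;
}

/* Unpack an already-fetched word into its two fields. */
void split_counterandmax_int(unsigned int cami, unsigned int *c,
                             unsigned int *cm)
{
    *c = (cami >> CM_BITS) & MAX_COUNTERMAX;
    *cm = cami & MAX_COUNTERMAX;
}

/* Atomically fetch the word, then unpack it; *old keeps the raw value. */
void split_counterandmax(atomic_uint *cam, unsigned int *old,
                         unsigned int *c, unsigned int *cm)
{
    unsigned int cami = atomic_load(cam);

    *old = cami;
    split_counterandmax_int(cami, c, cm);
}

/*
 * Hypothetical helper: zero one thread's word with a single atomic
 * exchange and fold its fields into the caller's global totals.
 * The caller is assumed to hold whatever lock guards those totals.
 */
void flush_one(atomic_uint *cam, unsigned long *globalcount,
               unsigned long *globalreserve)
{
    unsigned int c;
    unsigned int cm;
    unsigned int old = atomic_exchange(cam, merge_counterandmax(0, 0));

    split_counterandmax_int(old, &c, &cm);
    *globalcount += c;
    *globalreserve -= cm;
}
```

Because both fields live in one word, a single atomic exchange captures a consistent counter and maximum while zeroing them at the same instant, which is why the flush step can harvest another thread's state without that thread holding a lock.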
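The arithmetic on lines 41-51 of Figure 4.18 is easier to follow with concrete numbers. The sketch below restates it as a pure function over an explicit state structure. The struct names, the balance_share() function, and passing the thread count and per-thread cap as parameters are all assumptions made for illustration, not part of the original code.

```c
#include <assert.h>

/* Hypothetical container for the global variables used in Figure 4.18. */
struct limit_state {
    unsigned long globalcountmax;  /* overall limit */
    unsigned long globalcount;     /* count held globally */
    unsigned long globalreserve;   /* sum of per-thread maxima handed out */
};

/* Hypothetical result: the calling thread's new counter and maximum. */
struct local_share {
    unsigned long c;
    unsigned long cm;
};

/* Mirrors lines 41-51 of Figure 4.18, without the atomic store. */
struct local_share balance_share(struct limit_state *g,
                                 unsigned long nthreads,
                                 unsigned long max_countermax)
{
    struct local_share s;
    unsigned long limit;

    assert(nthreads > 0);
    /* Headroom not yet counted or reserved, split evenly among threads. */
    limit = g->globalcountmax - g->globalcount - g->globalreserve;
    limit /= nthreads;
    /* Cap the share at what a packed field can hold. */
    s.cm = limit > max_countermax ? max_countermax : limit;
    g->globalreserve += s.cm;
    /* Start the local counter at half its maximum, if the global has it. */
    s.c = s.cm / 2;
    if (s.c > g->globalcount)
        s.c = g->globalcount;
    g->globalcount -= s.c;
    return s;
}
```

For example, with a limit of 10000, a global count of 5000, no reserve, four threads, and a cap of 65535, the calling thread receives a maximum of 1250 and a counter of 625, leaving the global count at 4375 and the reserve at 1250. The total count (4375 + 625) is unchanged at 5000. Starting the local counter at half its maximum leaves room for both increments and decrements on the fastpath.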
