
Is Parallel Programming Hard, And, If So, What Can You Do About It?



CHAPTER 9. APPLYING RCU

   1 struct countarray {
   2   unsigned long total;
   3   unsigned long *counterp[NR_THREADS];
   4 };
   5
   6 long __thread counter = 0;
   7 struct countarray *countarrayp = NULL;
   8 DEFINE_SPINLOCK(final_mutex);
   9
  10 void inc_count(void)
  11 {
  12   counter++;
  13 }
  14
  15 long read_count(void)
  16 {
  17   struct countarray *cap;
  18   unsigned long sum;
  19   int t;
  20
  21   rcu_read_lock();
  22   cap = rcu_dereference(countarrayp);
  23   sum = cap->total;
  24   for_each_thread(t)
  25     if (cap->counterp[t] != NULL)
  26       sum += *cap->counterp[t];
  27   rcu_read_unlock();
  28   return sum;
  29 }
  30
  31 void count_init(void)
  32 {
  33   countarrayp = malloc(sizeof(*countarrayp));
  34   if (countarrayp == NULL) {
  35     fprintf(stderr, "Out of memory\n");
  36     exit(-1);
  37   }
  38   bzero(countarrayp, sizeof(*countarrayp));
  39 }
  40
  41 void count_register_thread(void)
  42 {
  43   int idx = smp_thread_id();
  44
  45   spin_lock(&final_mutex);
  46   countarrayp->counterp[idx] = &counter;
  47   spin_unlock(&final_mutex);
  48 }
  49
  50 void count_unregister_thread(int nthreadsexpected)
  51 {
  52   struct countarray *cap;
  53   struct countarray *capold;
  54   int idx = smp_thread_id();
  55
  56   cap = malloc(sizeof(*countarrayp));
  57   if (cap == NULL) {
  58     fprintf(stderr, "Out of memory\n");
  59     exit(-1);
  60   }
  61   spin_lock(&final_mutex);
  62   *cap = *countarrayp;
  63   cap->total += counter;
  64   cap->counterp[idx] = NULL;
  65   capold = countarrayp;
  66   rcu_assign_pointer(countarrayp, cap);
  67   spin_unlock(&final_mutex);
  68   synchronize_rcu();
  69   free(capold);
  70 }

Figure 9.1: RCU and Per-Thread Statistical Counters

the sum.

The initial value for countarrayp is provided by count_init() on lines 31-39. This function runs before the first thread is created, and its job is to allocate and zero the initial structure, and then assign it to countarrayp.

Lines 41-48 show the count_register_thread() function, which is invoked by each newly created thread. Line 43 picks up the current thread's index, line 45 acquires final_mutex, line 46 installs a pointer to this thread's counter, and line 47 releases final_mutex.

Quick Quiz 9.3: Hey!!! Line 46 of Figure 9.1 modifies a value in a pre-existing countarray structure! Didn't you say that this structure, once made available to read_count(), remained constant???

Lines 50-70 show count_unregister_thread(), which is invoked by each thread just before it exits. Lines 56-60 allocate a new countarray structure, line 61 acquires final_mutex, and line 67 releases it. Line 62 copies the contents of the current countarray into the newly allocated version, line 63 adds the exiting thread's counter to the new structure's total, and line 64 NULLs the exiting thread's counterp[] array element. Line 65 then retains a pointer to the current (soon to be old) countarray structure, and line 66 uses rcu_assign_pointer() to install the new version of the countarray structure. Line 68 waits for a grace period to elapse, so that any threads that might be concurrently executing in read_count(), and thus might have references to the old countarray structure, will be allowed to exit their RCU read-side critical sections, thus dropping any such references. Line 69 can then safely free the old countarray structure.

9.1.3 Discussion

Quick Quiz 9.4: Wow!!! Figure 9.1 contains 69 lines of code, compared to only 42 in Figure 4.8. Is this extra complexity really worth it?

Use of RCU enables exiting threads to wait until other threads are guaranteed to be done using the exiting threads' __thread variables. This allows the read_count() function to dispense with locking, thereby providing excellent performance and scalability for both the inc_count() and read_count() functions. However, this performance and scalability come at the cost of some increase in code complexity. It is hoped that compiler and library writers employ user-level RCU [Des09] to provide safe cross-thread access to __thread variables, greatly reducing the complexity seen by users of __thread variables.
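The copy, update, publish, wait, and free sequence that count_unregister_thread() follows can be tried out in isolation. The following is a minimal single-threaded sketch, not the code from Figure 9.1. It makes several assumptions: C11 atomics stand in for rcu_assign_pointer() and rcu_dereference() (an acquire load is used as a stronger substitute for the dependency ordering that rcu_dereference() provides), the hypothetical wait_for_readers_stub() marks the spot where synchronize_rcu() would go and does nothing because the sketch has no concurrent readers, and the hypothetical register_slot() and unregister_slot() take an explicit slot index and counter pointer instead of using __thread variables, smp_thread_id(), and final_mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_THREADS 4  /* assumption: small fixed slot count for the sketch */

struct countarray {
	unsigned long total;                  /* sum of counters of exited slots */
	unsigned long *counterp[NR_THREADS];  /* live counters, NULL if unused */
};

/* Currently published structure; readers only ever see a complete version. */
static _Atomic(struct countarray *) countarrayp;

/* Hypothetical placeholder for synchronize_rcu(). With no concurrent
 * readers in this single-threaded sketch there is nothing to wait for. */
static void wait_for_readers_stub(void)
{
}

void count_init(void)
{
	struct countarray *cap = calloc(1, sizeof(*cap));

	if (cap == NULL) {
		fprintf(stderr, "Out of memory\n");
		exit(EXIT_FAILURE);
	}
	atomic_store_explicit(&countarrayp, cap, memory_order_release);
}

/* Hypothetical helper: install a counter in a slot of the current structure. */
void register_slot(int idx, unsigned long *ctr)
{
	struct countarray *cap =
		atomic_load_explicit(&countarrayp, memory_order_relaxed);

	assert(idx >= 0 && idx < NR_THREADS);
	cap->counterp[idx] = ctr;
}

unsigned long read_count(void)
{
	/* Acquire load: stand-in for rcu_dereference(). */
	struct countarray *cap =
		atomic_load_explicit(&countarrayp, memory_order_acquire);
	unsigned long sum = cap->total;

	for (int t = 0; t < NR_THREADS; t++)
		if (cap->counterp[t] != NULL)
			sum += *cap->counterp[t];
	return sum;
}

/* Hypothetical helper: retire a slot by publishing an updated copy. */
void unregister_slot(int idx)
{
	struct countarray *capold =
		atomic_load_explicit(&countarrayp, memory_order_relaxed);
	struct countarray *cap = malloc(sizeof(*cap));

	assert(idx >= 0 && idx < NR_THREADS && capold->counterp[idx] != NULL);
	if (cap == NULL) {
		fprintf(stderr, "Out of memory\n");
		exit(EXIT_FAILURE);
	}
	*cap = *capold;                        /* copy the current version */
	cap->total += *capold->counterp[idx];  /* fold in the retiring counter */
	cap->counterp[idx] = NULL;             /* stop referencing it */
	/* Release store: stand-in for rcu_assign_pointer(). */
	atomic_store_explicit(&countarrayp, cap, memory_order_release);
	wait_for_readers_stub();               /* synchronize_rcu() goes here */
	free(capold);                          /* old version is now unreachable */
}
```

Calling count_init(), registering two counters, and then unregistering one of them should leave the value returned by read_count() unchanged, because the retired counter's value moves into total in the same published version that drops the pointer to it. In a real multithreaded program the stub would have to be replaced by an actual grace-period wait, and updates would need mutual exclusion as in Figure 9.1.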
