10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?

Is Parallel Programming Hard, And, If So, What Can You Do About It?

Is Parallel Programming Hard, And, If So, What Can You Do About It?

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

64 CHAPTER 5. PARTITIONING AND SYNCHRONIZATION DESIGNing the array of pointers making up the pool field,andthenumberprecedingthemrepresentingthecurfield. The shaded boxes represent non-NULL pointers,while the empty boxes represent NULL pointers.An important, though potentially confusing, invariantof this data structure is that the cur field isalways one smaller than the number of non-NULLpointers.(Empty)−101231 struct memblock *memblock_alloc(void)2 {3 int i;4 struct memblock *p;5 struct percpumempool *pcpp;67 pcpp = &__get_thread_var(percpumem);8 if (pcpp->cur < 0) {9 spin_lock(&globalmem.mutex);10 for (i = 0; i < TARGET_POOL_SIZE &&11 globalmem.cur >= 0; i++) {12 pcpp->pool[i] = globalmem.pool[globalmem.cur];13 globalmem.pool[globalmem.cur--] = NULL;14 }15 pcpp->cur = i - 1;16 spin_unlock(&globalmem.mutex);17 }18 if (pcpp->cur >= 0) {19 p = pcpp->pool[pcpp->cur];20 pcpp->pool[pcpp->cur--] = NULL;21 return p;22 }23 return NULL;24 }Figure 5.30: Allocator-Cache Allocator Function45Figure 5.29: Allocator Pool Schematic5.4.4.4 Allocation FunctionThe allocation function memblock_alloc() may beseen in Figure 5.30. Line 7 picks up the currentthread’s per-thread pool, and line 8 check to see ifit is empty.<strong>If</strong> so, lines 9-16 attempt to refill it from the globalpool under the spinlock acquired on line 9 and releasedon line 16. Lines 10-14 move blocks from theglobal to the per-thread pool until either the localpool reaches its target size (half full) or the globalpool is exhausted, and line 15 sets the per-threadpool’s count to the proper value.In either case, line 18 checks for the per-threadpool still being empty, and if not, lines 19-21 removeablockandreturnit. Otherwise, line23tellsthesadtale of memory exhaustion.5.4.4.5 Free FunctionFigure 5.31 shows the memory-block free function.Line 6 gets a pointer to this thread’s pool, and line 7checks to see if this per-thread pool is full.<strong>If</strong> so, lines 8-15 empty half of the per-thread poolinto the global pool, with lines 8 and 14 acquiringandreleasingthespinlock. Lines9-12implementtheloop moving blocks from the local to the global pool,and line 13 sets the per-thread pool’s count to theproper value.In either case, line 16 then places the newly freedblock into the per-thread pool.1 void memblock_free(struct memblock *p)2 {3 int i;4 struct percpumempool *pcpp;56 pcpp = &__get_thread_var(percpumem);7 if (pcpp->cur >= 2 * TARGET_POOL_SIZE - 1) {8 spin_lock(&globalmem.mutex);9 for (i = pcpp->cur; i >= TARGET_POOL_SIZE; i--) {10 globalmem.pool[++globalmem.cur] = pcpp->pool[i];11 pcpp->pool[i] = NULL;12 }13 pcpp->cur = i;14 spin_unlock(&globalmem.mutex);15 }16 pcpp->pool[++pcpp->cur] = p;17 }Figure 5.31: Allocator-Cache Free Function5.4.4.6 PerformanceRough performance results 9 are shown in Figure5.32, running on a dual-core Intel x86 runningat 1GHz (4300 bogomips per CPU) with at most sixblocks allowed in each CPU’s cache. In this microbenchmark,each thread repeated allocates a group9 This data was not collected in a statistically meaningfulway, and therefore should be viewed with great skepticismand suspicion. Good data-collection and -reduction practiceis discussed in Chapter @@@. That said, repeated runs gavesimilar results, and these results match more careful evaluationsof similar algorithms.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!