10.07.2015

Is Parallel Programming Hard, And, If So, What Can You Do About It?


before the grace period ends. Many RCU implementations define a grace period to be a time interval during which each thread has passed through at least one quiescent state. Since RCU read-side critical sections by definition cannot contain quiescent states, these two definitions are almost always interchangeable.

Hot Spot: Data structure that is very heavily used, resulting in high levels of contention on the corresponding lock. One example of this situation would be a hash table with a poorly chosen hash function.

Invalidation: When a CPU wishes to write to a data item, it must first ensure that this data item is not present in any other CPUs’ cache. If necessary, the item is removed from the other CPUs’ caches via “invalidation” messages from the writing CPUs to any CPUs having a copy in their caches.

IPI: Inter-processor interrupt, which is an interrupt sent from one CPU to another. IPIs are used heavily in the Linux kernel, for example, within the scheduler to alert CPUs that a high-priority process is now runnable.

IRQ: Interrupt request, often used as an abbreviation for “interrupt” within the Linux kernel community, as in “irq handler”.

Linearizable: A sequence of operations is “linearizable” if there is at least one global ordering of the sequence that is consistent with the observations of all CPUs/threads.

Lock: A software abstraction that can be used to guard critical sections and is, as such, an example of a “mutual exclusion mechanism”. An “exclusive lock” permits only one thread at a time into the set of critical sections guarded by that lock, while a “reader-writer lock” permits any number of reading threads, or but one writing thread, into the set of critical sections guarded by that lock.
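
The admission rule that separates a reader-writer lock from an exclusive lock can be sketched as a toy, non-blocking lock built on C11 atomics. This is only an illustration under stated assumptions, not how any particular kernel or library implements its locks: the `toy_rwlock` type and all `toy_*` functions are hypothetical names invented here, the lock fails instead of waiting, and it makes no attempt at fairness.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical toy lock, for illustration only.
 * state == 0: unlocked; state > 0: that many readers; state == -1: one writer.
 */
struct toy_rwlock {
	atomic_int state;
};

void toy_rwlock_init(struct toy_rwlock *l)
{
	atomic_init(&l->state, 0);
}

/* Admit a reader unless a writer holds the lock. */
bool toy_read_trylock(struct toy_rwlock *l)
{
	int old = atomic_load(&l->state);

	while (old >= 0) {
		/* On failure, the CAS refreshes old with the current state. */
		if (atomic_compare_exchange_weak(&l->state, &old, old + 1))
			return true;
	}
	return false;
}

void toy_read_unlock(struct toy_rwlock *l)
{
	int prev = atomic_fetch_sub(&l->state, 1);

	assert(prev > 0);	/* Caller must hold a read lock. */
	(void)prev;		/* Avoid an unused-variable warning under NDEBUG. */
}

/* Admit a writer only if there are no readers and no other writer. */
bool toy_write_trylock(struct toy_rwlock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

void toy_write_unlock(struct toy_rwlock *l)
{
	int prev = atomic_exchange(&l->state, 0);

	assert(prev == -1);	/* Caller must hold the write lock. */
	(void)prev;
}
```

With this sketch, two successive calls to `toy_read_trylock()` both succeed, and a following `toy_write_trylock()` fails until both readers have called `toy_read_unlock()`. Once a writer is admitted, further read and write attempts fail until `toy_write_unlock()`. A real lock would block or spin instead of failing, and production code would normally rely on an existing primitive such as POSIX `pthread_rwlock_t`.
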
(Just to be clear, the presence of a writer thread in any of a given reader-writer lock’s critical sections will prevent any reader from entering any of that lock’s critical sections and vice versa.)

Lock Contention: A lock is said to be suffering contention when it is being used so heavily that there is often a CPU waiting on it. Reducing lock contention is often a concern when designing parallel algorithms and when implementing parallel programs.

Memory Consistency: A set of properties that impose constraints on the order in which accesses to groups of variables appear to occur. Memory consistency models range from sequential consistency, a very constraining model popular in academic circles, through process consistency, release consistency, and weak consistency.

MESI Protocol: The cache-coherence protocol featuring modified, exclusive, shared, and invalid (MESI) states, so that this protocol is named after the states that the cache lines in a given cache can take on. A modified line has been recently written to by this CPU, and is the sole representative of the current value of the corresponding memory location. An exclusive cache line has not been written to, but this CPU has the right to write to it at any time, as the line is guaranteed not to be replicated into any other CPU’s cache (though the corresponding location in main memory is up to date). A shared cache line is (or might be) replicated in some other CPUs’ cache, meaning that this CPU must interact with those other CPUs before writing to this cache line. An invalid cache line contains no value, instead representing “empty space” in the cache into which data from memory might be loaded.

Mutual-Exclusion Mechanism: A software abstraction that regulates threads’ access to “critical sections” and corresponding data.

NMI: Non-maskable interrupt. As the name indicates, this is an extremely high-priority interrupt that cannot be masked. These are used for hardware-specific purposes such as profiling. The advantage of using NMIs for profiling is that it allows you to profile code that runs with interrupts disabled.

NUCA: Non-uniform cache architecture, where groups of CPUs share caches. CPUs in a group can therefore exchange cache lines with each other much more quickly than they can with CPUs in other groups. Systems comprised of CPUs with hardware threads will generally have a NUCA architecture.

NUMA: Non-uniform memory architecture, where memory is split into banks and each such bank is “close” to a group of CPUs, the group being termed a “NUMA node”. An example NUMA machine is Sequent’s NUMA-Q system, where each group of four CPUs had a bank of memory
