10.07.2015

Is Parallel Programming Hard, And, If So, What Can You Do About It?


APPENDIX G. GLOSSARY

nearby. The CPUs in a given group can access their memory much more quickly than another group’s memory.

NUMA Node: A group of closely placed CPUs and associated memory within a larger NUMA machine. Note that a NUMA node might well have a NUCA architecture.

Pipelined CPU: A CPU with a pipeline, which is a flow of instructions internal to the CPU that is in some way similar to an assembly line, with many of the same advantages and disadvantages. In the 1960s through the early 1980s, pipelined CPUs were the province of supercomputers, but started appearing in microprocessors (such as the 80486) in the late 1980s.

Process Consistency: A memory-consistency model in which each CPU’s stores appear to occur in program order, but in which different CPUs might see accesses from more than one CPU as occurring in different orders.

Program Order: The order in which a given thread’s instructions would be executed by a now-mythical “in-order” CPU that completely executed each instruction before proceeding to the next instruction. (The reason such CPUs are now the stuff of ancient myths and legends is that they were extremely slow. These dinosaurs were one of the many victims of Moore’s-Law-driven increases in CPU clock frequency. Some claim that these beasts will roam the earth once again; others vehemently disagree.)

Quiescent State: In RCU, a point in the code where there can be no references held to RCU-protected data structures, which is normally any point outside of an RCU read-side critical section. Any interval of time during which all threads pass through at least one quiescent state each is termed a “grace period”.

Read-Copy Update (RCU): A synchronization mechanism that can be thought of as a replacement for reader-writer locking or reference counting. RCU provides extremely low-overhead access for readers, while writers incur additional overhead maintaining old versions for the benefit of pre-existing readers. Readers neither block nor spin, and thus cannot participate in deadlocks; however, they can also see stale data and can run concurrently with updates. RCU is thus best suited for read-mostly situations where stale data can either be tolerated (as in routing tables) or avoided (as in the Linux kernel’s System V IPC implementation).

Read-Side Critical Section: A section of code guarded by read-acquisition of some reader-writer synchronization mechanism. For example, if one set of critical sections is guarded by read-acquisition of a given global reader-writer lock, while a second set of critical sections is guarded by write-acquisition of that same reader-writer lock, then the first set of critical sections will be the read-side critical sections for that lock. Any number of threads may concurrently execute the read-side critical sections, but only if no thread is executing one of the write-side critical sections.

Reader-Writer Lock: A reader-writer lock is a mutual-exclusion mechanism that permits any number of reading threads, or but one writing thread, into the set of critical sections guarded by that lock. Threads attempting to write must wait until all pre-existing reading threads release the lock, and, similarly, if there is a pre-existing writer, any threads attempting to read or write must wait for the writer to release the lock. A key concern for reader-writer locks is “fairness”: can an unending stream of readers starve a writer or vice versa?

Sequential Consistency: A memory-consistency model where all memory references appear to occur in an order consistent with a single global order, and where each CPU’s memory references appear to all CPUs to occur in program order.

Store Buffer: A small set of internal registers used by a given CPU to record pending stores while the corresponding cache lines are making their way to that CPU. Also called “store queue”.

Store Forwarding: An arrangement where a given CPU refers to its store buffer as well as its cache so as to ensure that the software sees the memory operations performed by this CPU as if they were carried out in program order.

Super-Scalar CPU: A scalar (non-vector) CPU capable of executing multiple instructions concurrently. This is a step up from a pipelined CPU that executes multiple instructions in an assembly-line fashion: in a super-scalar CPU, each stage of the pipeline would be capable of
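
The Store Buffer and Store Forwarding entries above can be illustrated with a toy model. The sketch below is not how any real CPU is implemented: the names ToyCPU, store, load, drain, and demo are hypothetical, a shared Python dict stands in for the cache and memory, and cache coherence is reduced to an explicit drain step. It only shows why a CPU sees its own pending stores in program order while another CPU may still see the old value.

```python
class ToyCPU:
    """Toy model (an assumption for illustration) of one CPU with a store buffer."""

    def __init__(self, memory):
        self.memory = memory      # shared dict standing in for cache/memory
        self.store_buffer = []    # pending (address, value) pairs, oldest first

    def store(self, addr, value):
        # Record the store locally instead of waiting for the cache line.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Store forwarding: the newest pending store to this address wins.
        for pending_addr, value in reversed(self.store_buffer):
            if pending_addr == addr:
                return value
        # No pending store for this address, so fall back to the shared memory.
        return self.memory.get(addr, 0)

    def drain(self):
        # The cache lines have "arrived": apply pending stores in program order.
        for addr, value in self.store_buffer:
            self.memory[addr] = value
        self.store_buffer.clear()


def demo():
    memory = {"x": 0}
    cpu0, cpu1 = ToyCPU(memory), ToyCPU(memory)
    cpu0.store("x", 1)
    own_view = cpu0.load("x")      # forwarded from cpu0's own store buffer
    other_view = cpu1.load("x")    # cpu1 still sees the old value
    cpu0.drain()
    after_drain = cpu1.load("x")   # the store is now visible to cpu1
    return own_view, other_view, after_drain
```

In this model, demo() returns (1, 0, 1): cpu0 observes its own store immediately, cpu1 observes it only after the buffer drains. Without the forwarding loop in load, cpu0 would read 0 right after storing 1, which is the program-order violation that store forwarding exists to prevent.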
