11.07.2015 Views

The GPU Computing Revolution - London Mathematical Society



From Multi-Core CPUs To Many-Core Graphics Processors

From Multi-Core to Many-Core: Background and Development

Historically, advances in computer hardware have delivered performance improvements that have been transparent to the end user: software has run more quickly on newer hardware without having to make any changes to that software. We have enjoyed decades of ‘single thread’ performance improvement, which relied on converting exponential increases in available transistors (known as Moore’s Law [83]) into increasing processor clock speeds and advances in microprocessor pipelining, instruction-level parallelism and out-of-order execution. All these technology developments are essentially ‘under the hood’, with existing software generally performing faster on next generation hardware without programmer intervention. Most of these developments have now hit a wall.

In 2003 it started to become apparent that several fundamental physical limitations were going to change everything. Herb Sutter noticed this significant change in his widely cited 2004 essay ‘The Free Lunch is Over’ [110]. Figure 1 is taken from Sutter’s paper, updated in 2009, and shows how single thread performance increases, based on increasing clock speed and instruction level parallelism improvements, could only have continued at the cost of significant increases in processor power consumption. Something had to change: the free lunch was over.

The answer was, and is, parallelism. It transpires that, for various semiconductor physics-related reasons, a processor containing two 3GHz cores will use significantly less energy (and thus dissipate less waste heat) than an alternative processor containing a single 6GHz core. Yet in theory the dual-core 3GHz device can deliver the same performance as the single 6GHz core (much more on this later). Today even laptop processors contain at least two cores, with most mainstream server CPUs already containing four or six cores. Thus, due to the power consumption problem, increasing core counts have replaced increasing clock speeds as the main method of delivering greater hardware performance. (For more details see Box 1 later.)

Of course the important implication of ‘the free lunch is over’ is that most existing software will not just go faster when we increase the number of cores inside a processor, unless you are in the fortunate position that your performance-critical software is already ‘multi-core aware’.

So, hardware is suddenly and dramatically changing: it is delivering ever more performance, but is now doing so through rapidly

Figure 1: Graph of four key technology trends from ‘The Free Lunch is Over’: transistors per processor, clock speed, power consumption and instruction level parallelism [110].
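The comparison between two 3GHz cores and one 6GHz core, and the caveat that software must be ‘multi-core aware’ to benefit, can be illustrated with a rough back-of-the-envelope sketch. The model below is not taken from the report. It assumes the textbook approximation that dynamic power scales as capacitance × voltage² × frequency, and further assumes (as a simplification) that supply voltage must rise linearly with clock frequency, so per-core power grows roughly with the cube of frequency. The speedup function is the standard Amdahl's law formula. The function names and the 3GHz baseline are illustrative choices only.

```python
def relative_dynamic_power(cores, freq_ghz, base_freq_ghz=3.0):
    """Power relative to ONE core running at base_freq_ghz.

    Assumption (not from the report): dynamic power ~ C * V^2 * f, with
    supply voltage scaling linearly with frequency, so per-core power
    grows as f^3. Real processors deviate from this idealised model.
    """
    return cores * (freq_ghz / base_freq_ghz) ** 3


def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only a fraction of the
    work (parallel_fraction, between 0 and 1) can be spread across cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    single_fast = relative_dynamic_power(cores=1, freq_ghz=6.0)
    dual_slow = relative_dynamic_power(cores=2, freq_ghz=3.0)
    print(f"one 6GHz core : {single_fast:.1f}x the power of one 3GHz core")
    print(f"two 3GHz cores: {dual_slow:.1f}x the power of one 3GHz core")

    # Speedup on two cores for software with varying degrees of parallelism.
    for fraction in (0.0, 0.5, 1.0):
        speedup = amdahl_speedup(fraction, cores=2)
        print(f"parallel fraction {fraction:.1f}: {speedup:.2f}x speedup on 2 cores")
```

Under these assumptions the dual-core device draws about a quarter of the power of the single fast core, but it only matches that core's performance when the software is fully parallel. Purely serial code (parallel fraction 0) sees no speedup at all from the second core, which is the practical meaning of ‘the free lunch is over’.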
