THE GPU COMPUTING REVOLUTION
From Multi-Core CPUs to Many-Core Graphics Processors

Current Challenges

Like all breakthroughs in technology, the change from multi-core to many-core computer architectures will not be smooth for everyone. There are significant challenges during the transition, some of which are outlined below.

1. Porting code to massively parallel heterogeneous systems is often (but not always) harder than ports to new hardware have been in the past. Completely new algorithms are often required.

2. Many-core technologies are still relatively new, with implications for the maturity of software tools, the shortage of software developers with the right skills and experience, and the paucity of ported application software and libraries.

3. Even though cross-platform programming languages such as OpenCL are now emerging, these have so far focused on source-code portability and cannot guarantee performance portability. This is of course not a new issue; any highly optimised code written in a mainstream language such as C, C++ or Fortran has performance-portability issues between different architectures. However, the differences between GPU architectures are even greater than those between CPU architectures, so performance portability is set to become a greater challenge in the future.

4. There are multiple competing open and de facto standards, which inevitably confuses the situation for potential adopters.

5. Many current GPGPU products still carry some of their consumer-graphics heritage, including the lack of important hardware reliability features such as Error Correcting Codes (ECC) on their memories. Even where these features do exist, they currently incur prohibitive performance penalties that are not present in the corresponding CPU solutions.

6. There is a lot of hype around GPU computing, with many over-inflated claims of performance speedups of 100 times or more.
These claims increase the risk of setting expectations too high, with subsequent disappointment from trial projects.

7. The lack of industry-standard benchmarks makes it difficult for users to compare competing many-core products simply and accurately.

Of all these challenges, the most fundamental is the design and development of new algorithms that will naturally lend themselves to the massive parallelism of GPUs today, and to the ubiquitous heterogeneous multi-/many-core systems of tomorrow. If as a community we can design adaptive, highly scalable algorithms, ideally with heterogeneity and even fault tolerance in mind, we will be well placed to exploit the rapid development of parallel architectures over the next two decades.
A KNOWLEDGE TRANSFER REPORT FROM THE LMS AND THE KTN FOR INDUSTRIAL MATHEMATICS

Next Steps

Audit your software

One valuable practical step we can each take is to perform an audit of the software we use that is performance-critical to our work. For software developed by third parties, find out what their policy is towards supporting many-core processors such as GPUs. Is their software already parallel? If so, how scalable is it? Does it run effectively on a quad-core CPU today? What about the emerging 8-, 12- and 16-core CPUs? Do they have a demonstration of their software accelerated on a GPU? What is their roadmap for the software? You also need to be conscious of the software licensing model: is the software licensed per core, processor, node, user, . . . ? Will you have to pay more for an upgrade that supports many-core processors? Some vendors will supply these features as no-cost upgrades; others will charge extra for them.

It is also important to be specific about parallel acceleration of the particular features you use in the software in question. For example, at the time of writing there are GPU-accelerated versions of the dense linear algebra solvers in MATLAB, but not of the sparse linear algebra solvers [73]. Just because an application claims to be 'GPU-accelerated', it does not necessarily follow that your particular use of that application will gain the performance benefit of GPUs. Your mileage will vary, so check with the supplier of your software before committing.

Plan for parallelism

If you develop your own software, start thinking about what your own path to parallelism should be. Are the users of your software likely to stick primarily to multi-core processors in mainstream laptops, desktops and servers? If so, you should be thinking about adopting OpenMP, MPI or another widely supported approach to parallel programming.
You should probably avoid proprietary approaches that may not support all platforms or be commercially viable in the long term. Instead, use open standards available across multiple platforms and vendors to minimise your risk.

Also consider when many-core processors will feature in your roadmap. These are inevitable: even mainstream CPUs will rapidly become heterogeneous many-cores, so this really is a question of 'when', not 'if'. If you do not want to support many-core processors in the near or medium term, OpenMP and MPI will be good choices. If, however, you may want to support many-core processors within the next few years, you will need a plan to adopt either OpenCL or CUDA sooner rather than later. OpenCL might be a viable alternative to OpenMP on multi-core CPUs in the short term.

If you are going to develop your own many-core-aware software, there is a tremendous amount of support that you can tap into.

Attend a workshop

In the UK each year there are several training workshops in the use of GPUs. The national supercomputing service HECToR [53] is starting to provide GPU workshops on CUDA and OpenCL programming; the timetable for these is available online [52]. Prof Mike Giles at the University of Oxford regularly runs CUDA programming workshops; for the date of the next one see [43]. His webpage also includes excellent links to other GPU programming resources. A search for GPU training in the UK should turn up offerings from several universities. There are also commercial training providers in the UK that are worth considering, such as NAG [89]. Daresbury Labs has a team who track the latest processor technologies and who are already experienced in developing software for GPUs. This group holds occasional seminars and workshops, and is willing to offer advice and guidance to newcomers to many-core technologies [30].
GPU vendors will often provide assistance if asked, especially if their assistance could lead to new sales.

There are also many conferences and seminars emerging to address many-core computing. The UK now has a regular GPU developers' workshop. Previous years have seen the workshop held in Oxford [1] and Cambridge [2]. The 2011 workshop is due to be held at Imperial College.

A useful GPU computing resource is GPUcomputing.net [47]. In particular, it has a page dedicated to GPU computing in the UK [45]. This site is mostly dominated by the use of NVIDIA GPUs, reflecting NVIDIA's lead in the market, but over time the site should see a greater percentage of content coming from work on a wider range of platforms.

To get started with OpenCL, one good place to start is AMD's