13.07.2015 Views

Parallel Universe Issue 4 - XLsoft.com

PARALLEL UNIVERSE

Contents

James Reinders: 2010
Intel® Parallel Studio 2011: Leila Chucri
39: Stephen Blair-Chappell; Lars Peters Endresen, Håvard Graff


James Reinders
Author of "Intel Threading Building Blocks: Outfitting C++ for Multicore Processor Parallelism"

Figure 1. Intel® Parallel Building Blocks

September 2010: Intel® Parallel Studio 2011

Intel® Parallel Studio 2011
  - Intel® Parallel Building Blocks (Intel® PBB)
  - Intel® Parallel Advisor 2011
Microsoft* Visual Studio* 2005, 2008, and 2010

Intel® Parallel Studio 2011; Intel® PBB; Intel® TBB; Adobe* Creative Suite* 5; Intel® TBB; Intel® Parallel Studio; Windows*; Intel® PBB; Intel® TBB; Intel® Cilk Plus; Intel® Array


Figure 2: Intel® Parallel Studio 2011 (panels: Intel® Parallel Advisor, Intel® Parallel Composer, Intel® Parallel Inspector, Intel® Parallel Amplifier; C++; Intel® Parallel Building Blocks)

Building Blocks (Intel® ArBB)

Intel® TBB: FIFO; C++0x; concurrent_unordered_map; Microsoft* Visual Studio* 2010; tbb::graph

Intel® Cilk Plus (4):
  - Three keywords: cilk_for (for loops), cilk_spawn, cilk_sync
  - Array notation: a[:] = b[:] + c[:] and a[:][1] = sqrt(b[:][2])
  - __sec_map(saxpy, 2.0, x[0:n], y[0:n])

Intel® Cilk Plus; Intel® TBB; Intel® Cilk Plus (1) (2); Intel® Cilk Plus; Intel® C/C++; Intel® ArBB; SIMD; Intel® Parallel Studio 2011; Intel® Parallel Studio, May 2009; Intel® Parallel Advisor; Intel® PBB; Intel® Parallel Advisor

James Reinders
September 2010
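The array-notation addition and the `__sec_map(saxpy, 2.0, x[0:n], y[0:n])` call above both describe element-wise work over whole arrays. As a rough illustration of that meaning only, the sketch below writes the same operations as ordinary serial C++ loops. The `saxpy` signature (returning `a * x + y` for one element) and the choice to return a new vector rather than update `y` in place are assumptions made for the example; the page does not show how `saxpy` is declared, and the names `add_arrays` and `map_saxpy` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed scalar function: one element of a*x + y.
double saxpy(double a, double x, double y) { return a * x + y; }

// Serial equivalent of an element-wise array addition.
std::vector<double> add_arrays(const std::vector<double>& b,
                               const std::vector<double>& c)
{
    assert(b.size() == c.size());
    std::vector<double> a(b.size());
    for (std::size_t i = 0; i < b.size(); ++i)
        a[i] = b[i] + c[i];
    return a;
}

// Serial equivalent of mapping the scalar saxpy over two arrays.
std::vector<double> map_saxpy(double a,
                              const std::vector<double>& x,
                              const std::vector<double>& y)
{
    assert(x.size() == y.size());
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = saxpy(a, x[i], y[i]);
    return out;
}
```

Each loop iteration is independent of the others, which is what allows a compiler to vectorize or parallelize this kind of statement.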


Intel® Parallel Studio 2011
Leila Chucri

Intel® Parallel Studio 2011; 1; Intel® Parallel Studio; Microsoft* Visual Studio*; PPL

Intel® Parallel Studio 2011
Intel® Parallel Studio 2011


Intel® Parallel Studio 2011

Intel® Parallel Studio 2011; Microsoft* Visual Studio* 2005, 2008, and 2010; C/C++

Intel® Parallel Building Blocks (Intel® PBB): Intel® Parallel Building Blocks
  - Intel® Threading Building Blocks 3.0: C++
  - Intel® Cilk Plus: Intel® C/C++
  - Intel® Array Building Blocks: API; software.intel.com/enus/data-parallel/
Intel® Parallel Advisor:
Microsoft* Visual Studio* 2010

Intel® TBB; Intel® Core i7; Simul Weather SDK; Simul Weather*
Roderick Kennedy, Simul Software


Intel® Parallel Advisor:
Intel® Parallel Advisor; C/C++; ROI; Intel® Parallel Advisor; CPU
William Orttung

Intel® Parallel Advisor
Matt Osterberg, Vickery Research Alliance

Intel® Parallel Advisor; Microsoft* Visual Studio*; C++; Intel® Parallel Advisor
Brian Reynolds, Brian Reynolds Research


Intel® Parallel Composer 2011:
C/C++; Intel® Parallel Composer 2011; C/C++; Intel® IPP; Intel® Parallel Building Blocks (Intel® PBB); Intel® Parallel Composer; Intel® Parallel Studio 2011

Trading Systems Lab; Intel® Parallel Studio; C++; TSL Algo Auto-Design Platform; 10%, 20%; Microsoft* Visual C++*; Parallel Studio
Mike Barna, Trading Systems Lab


Intel® Parallel Inspector 2011:
ROI; Intel® Parallel Inspector

SIMULIA; Intel® Parallel Inspector
Matt Dunbar, SIMULIA


Intel® Parallel Amplifier 2011:
Intel® Parallel Amplifier 2011; Windows*

DEVELOPER SPOTLIGHT
Intel® Paragon; Intel® Itanium; Intel® Xeon; ISV; Intel® Parallel Amplifier; 10; GUI
Dat Chu

Intel® Atom; SoC; HPC; Intel® Atom; ISV; SFF; OS; MeeGo*; HPC; SFF PC; Intel® Atom; SFF; HPC


39
Stephen Blair-Chappell, Intel Compiler Labs

39; Lars Peters Endresen; Håvard Graff

9x9; 6 x 10^21; 38; 39; 3
  - 1:
  - 2: SSE
  - 3:
2; 2,000-3,000

1:
17; 18; 3; 9; -2 + 1; 39; 38; -1 + 2

2:
CPU; SIMD (Single Instruction Multiple Data); MMX; SSE; SSE2; ...


SSE
SSE; C/C++; SIMD; Intel® MMX; SSE4.2; 128; SSE2; SSE; C++

3: OpenMP* 3.0
OpenMP*; OpenMP* 3.0; Intel® C/C++ 11.0; OpenMP*

Intel® Cilk
Intel® Parallel Studio; 4; Intel® Parallel Building Blocks; 39; OpenMP*

Figure 1: 17

    for (int num = 0; num < 9; num++)
    {
        // Candidate cells for this number, as a 128-bit mask
        __m128i xmm0 = _mm_and_si128(BinSmallNum, BinNum[num]);
        for (int i = 0; i < 9; i++)
        {
            __m128i BoxSum = _mm_and_si128(BinBox[i], xmm0);
            __m128i RowSum = _mm_and_si128(BinRow[i], xmm0);
            __m128i ColumnSum = _mm_and_si128(BinColumn[i], xmm0);

            // BoxSum is only in scope inside this loop, so the test belongs here.
            // The listing shows the box test only; RowSum and ColumnSum are computed but not tested.
            if (ExactlyOneBit(BoxSum))
            {
                int cell = BitToNum(BoxSum);
                FoundNumber(cell, num);
                return true;
            }
        }
    }

xmm0; i

Figure 2: SSE
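The SSE listing above calls `ExactlyOneBit` and `BitToNum` without showing them. The following is a minimal sketch of what such helpers might look like, not the authors' implementation. To stay portable it models the 128-bit register as two 64-bit words instead of `__m128i`, and it assumes that bit k of the mask stands for cell k of the 81-cell grid. `Mask128`, `mask_and`, `exactly_one_bit`, `bit_to_num`, and `cell_bit` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: bit k set means cell k (0..80) is still a candidate.
struct Mask128 {
    std::uint64_t lo;  // cells 0..63
    std::uint64_t hi;  // cells 64..127 (only 64..80 are used by a 9x9 grid)
};

// Portable stand-in for a 128-bit bitwise AND.
Mask128 mask_and(Mask128 a, Mask128 b) { return {a.lo & b.lo, a.hi & b.hi}; }

// True when x has exactly one bit set: clearing the lowest set bit leaves zero.
bool is_power_of_two(std::uint64_t x) { return x != 0 && (x & (x - 1)) == 0; }

// True when exactly one of the 128 bits is set.
bool exactly_one_bit(Mask128 m)
{
    return (m.hi == 0 && is_power_of_two(m.lo)) ||
           (m.lo == 0 && is_power_of_two(m.hi));
}

// Index of the single set bit; the caller must ensure exactly one bit is set.
int bit_to_num(Mask128 m)
{
    assert(exactly_one_bit(m));
    std::uint64_t word = m.lo ? m.lo : m.hi;
    int index = m.lo ? 0 : 64;
    while ((word & 1u) == 0) {
        word >>= 1;
        ++index;
    }
    return index;
}

// Mask with only the bit for the given cell set.
Mask128 cell_bit(int cell)
{
    assert(cell >= 0 && cell < 128);
    return cell < 64 ? Mask128{std::uint64_t{1} << cell, 0}
                     : Mask128{0, std::uint64_t{1} << (cell - 64)};
}
```

If the AND of a box mask and a candidate mask leaves a single bit, that bit identifies the only cell in the box that can still hold the number, which is the case the listing acts on.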


Intel® Cilk Plus; C; Cilk; three keywords: cilk_spawn, cilk_sync, cilk_for; cilk.h; 5; Cilk; Intel® Cilk Plus; 6; Cilk; cilk_for; OpenMP*; 3

Intel® Cilk Plus
1; Cilk; get_value(); 7; gNumCilkPuzzlesSolved; reducer_opadd; get_value()

    #pragma omp parallel
    {
        #pragma omp single nowait
        {
            for (int i = 0; i < NUM_NODES - 1; i++)
            {
                NODE Node1 = pPuzzle->Nodes[i];
                if (Node1.number > 0)
                {
                    // Give each task its own copy of the puzzle
                    memcpy(&gPuzzles[i], pPuzzle, sizeof(SUDOKU));
                    // firstprivate: the task captures the value of i at creation time
                    #pragma omp task firstprivate(i)
                    GenDoWork(&gPuzzles[i], i);
                }
            }
        }
    }

for; GenDoWork()

Figure 3. OpenMP*

Figure 4. Intel® Parallel Building Blocks
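The OpenMP* listing above depends on types and functions (`SUDOKU`, `NODE`, `GenDoWork`) that are not shown on this page. As a self-contained illustration of the same task pattern, the sketch below has one thread walk a loop and create a task per qualifying element while the rest of the team executes the tasks. `do_work` and `run_tasks` are hypothetical stand-ins, and squaring a number is only a placeholder for the real per-node work. Compiled without OpenMP support, the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical placeholder for the per-node work.
int do_work(int value) { return value * value; }

// Creates one task per positive entry; each task writes only its own slot.
std::vector<int> run_tasks(const std::vector<int>& nodes)
{
    std::vector<int> results(nodes.size(), 0);
    const int n = static_cast<int>(nodes.size());

    #pragma omp parallel
    {
        // One thread generates the tasks; the others go straight to the
        // barrier at the end of the parallel region and pick up tasks there.
        #pragma omp single nowait
        {
            for (int i = 0; i < n; ++i) {
                if (nodes[i] > 0) {
                    // Capture the current value of i for this task.
                    #pragma omp task firstprivate(i)
                    results[i] = do_work(nodes[i]);
                }
            }
        }
    }   // implicit barrier: every task has completed past this point

    return results;
}
```

Because each task writes to a distinct element of `results`, no locking is needed. That mirrors the listing's use of a separate `gPuzzles[i]` copy per task.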


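    #include <cilk/cilk.h>

    void work(int num)
    {
        // ...
    }

    void func1()
    {
        cilk_spawn work(1);  // work(1) may run in parallel with the code that follows
        work(2);
        cilk_sync;           // wait until the spawned work(1) has finished
    }

    void func2()
    {
        cilk_for (int i = 0; i ...  // the rest of this loop is missing from the source
    }

cilk_sync; cilk_spawn; cilk_sync; cilk_spawn work(1); work(1); cilk_sync
cilk_for; C/C++ for

Figure 5. Intel® Cilk Plus; 3

    #include <cilk/cilk.h>
    ...
    cilk_for (int i = 0; i < NUM_NODES - 1; i++)
    {
        NODE Node1 = pPuzzle->Nodes[i];
        if (Node1.number > 0)
        {
            // Give each iteration its own copy of the puzzle
            memcpy(&gPuzzles[i], pPuzzle, sizeof(SUDOKU));
            GenDoWork(&gPuzzles[i], i);
        }
    }

Figure 6. cilk_for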


    int gNumCilkPuzzlesSolved;
    // ..
    gNumCilkPuzzlesSolved++;   // shared counter incremented from parallel code
    // .
    int Tmp = gNumCilkPuzzlesSolved;

(a) gNumCilkPuzzlesSolved

    #include <cilk/reducer_opadd.h>

    cilk::reducer_opadd<int> gNumCilkPuzzlesSolved;
    ..
    gNumCilkPuzzlesSolved++;   // each parallel strand updates its own view
    // ..
    int Tmp = gNumCilkPuzzlesSolved.get_value();   // read the combined total

(b)

Figure 7. Intel® Cilk Plus

Figure 8. CPU; 100%
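The `reducer_opadd` listing above replaces a shared counter with one that parallel code can update safely. Intel® Cilk Plus needs a compiler that supports it, so as a rough comparison only, the sketch below counts the same way using the OpenMP* `reduction` clause, a different mechanism with a similar goal: each thread increments a private copy, and the copies are summed when the loop ends. `is_solved` and `count_solved` are hypothetical names, and the even-number test is only a placeholder for "this puzzle was solved". Without OpenMP support the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <vector>

// Hypothetical placeholder for "this puzzle was solved".
bool is_solved(int puzzle) { return puzzle % 2 == 0; }

// Counts solved puzzles without a data race on the counter.
int count_solved(const std::vector<int>& puzzles)
{
    int solved = 0;
    const int n = static_cast<int>(puzzles.size());

    // Each thread gets a private copy of solved, initialized to 0.
    #pragma omp parallel for reduction(+ : solved)
    for (int i = 0; i < n; ++i) {
        if (is_solved(puzzles[i]))
            ++solved;   // updates the thread's private copy only
    }

    // The private copies have been added into solved by this point.
    return solved;
}
```

Reading the total only after the parallel loop has finished plays the same role as calling `get_value()` on the reducer once the parallel work is complete.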


OpenMP*; SMT; 8; Cilk; OpenMP*
9; 3; 39

Figure 9. -1 + 2; 3; 39

Intel® Cilk Plus
Intel® Cilk Plus; 3
  - C/C++
  - simd
Intel® C/C++; SIMD

OpenMP; Lars Peters Endresen; Håvard Graff

Stephen Blair-Chappell
Intel® Cilk Plus

Parallel Programming with Intel Parallel Studio, Stephen Blair-Chappell and Andrew Stokes, WROX (Wiley Publishing Inc.), ISBN 9780470891650, March 2011

Web: http://www.intel.co.jp/jp/software/products/


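Intel® (SIMD); Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming SIMD Extensions 3 (Intel® SSE3), Supplemental Streaming SIMD Extensions 3 (Intel® SSSE3)
#20101101

© 2010 Intel Corporation. Intel, the Intel logo, Intel Atom, Intel Core, Itanium, and Xeon are trademarks of Intel Corporation.
* Other names and brands may be claimed as the property of others.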
