CUDA Libraries and MPI+OpenMP+CUDA - Prace Training Portal
WHEN and WHY is running more MPI processes per GPU a good idea?

Usually, CUDA optimizations are performed starting from a serial code, using the Visual Profiler (or working directly on its text file).

Introducing a parallelization means distributing the data. As a result, the compute footprint of some CUDA kernels might decrease, or the transfer time might overtake the compute time (even worse!).

The GPU does its "duty" as an accelerator best when there is enough computation to exploit all the parallelism of all the SMs. So let's safely share the GPU and its resources among processes.

Each single piece of computation gets less speed-up, but there is more interleaved work on the GPU coming from different processes. Kernels are less efficient, but more of them can run concurrently: PERFORMANCE!

February 10, 2012, PRACE Winter School 2012, slide 68