13.07.2015

CUDA Libraries and MPI+OpenMP+CUDA - Prace Training Portal



Parallelization levels in PWSCF

Images
  • Only for Nudged Elastic Band (NEB) calculations

GPU!

K-points
  • Distribution over k-points (if more than one)
  • Scalability over k-points (if more than one)
  • No memory scaling

Plane-waves
  • Distribution of wave-function coefficients
  • Distribution of real-grid points
  • Good memory scale, good overall scalability, LB

Linear algebra & task groups
  • Iterative diagonalization (fully-parallel or serial)
  • Smart grouping of 3DFFTs to reduce compulsory MPI communications

Multi-threaded kernels
  • OpenMP handled explicitly or implicitly
  • Extend the scaling on multi-core machines with "limited" memory

February 10, 2012, PRACE Winter School 2012
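The k-point and plane-wave levels listed above can be illustrated with a small sketch of a two-level decomposition. This is not PWSCF's actual code: the function names `block_range` and `decompose`, the contiguous block layout, and the example sizes are all assumptions chosen for illustration. The sketch groups ranks into pools, gives each pool a slice of the k-points, and then splits the plane-wave coefficients among the ranks inside a pool. It also shows why the k-point level brings no memory scaling: every pool still holds the full set of plane-wave coefficients for its own k-points.

```python
def block_range(n_items, n_parts, part):
    """Return the half-open range [start, stop) of items owned by `part`.

    Items are split into contiguous blocks; the first `n_items % n_parts`
    parts receive one extra item. (Hypothetical helper, not a PWSCF routine.)
    """
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def decompose(rank, n_ranks, n_pools, n_kpoints, n_planewaves):
    """Map one rank onto a k-point pool and a plane-wave slice.

    Assumption: ranks are grouped into equally sized pools in rank order.
    """
    if n_ranks % n_pools != 0:
        raise ValueError("n_ranks must be a multiple of n_pools")
    ranks_per_pool = n_ranks // n_pools
    pool = rank // ranks_per_pool          # which k-point pool this rank joins
    rank_in_pool = rank % ranks_per_pool   # position inside that pool
    return {
        "pool": pool,
        # Level 1: k-points are distributed across pools.
        "kpoints": block_range(n_kpoints, n_pools, pool),
        # Level 2: plane-wave coefficients are distributed inside a pool.
        "planewaves": block_range(n_planewaves, ranks_per_pool, rank_in_pool),
    }


if __name__ == "__main__":
    # Example sizes are made up: 8 ranks, 2 pools, 5 k-points, 1000 plane waves.
    for rank in range(8):
        print(rank, decompose(rank, 8, 2, 5, 1000))
```

In a real MPI code the pool membership would typically be expressed by splitting the global communicator, and the OpenMP or GPU level would operate inside each rank on its local plane-wave slice.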
