CUDA Libraries and MPI+OpenMP+CUDA - Prace Training Portal
CUBLAS - Thunking versus non-Thunking

Thunking:
• Allows interfacing to existing applications without any changes, via wrappers
• During each call, the wrappers allocate GPU memory, copy the source data from CPU memory space to GPU memory space, call CUBLAS, and finally copy the results back to CPU memory space and deallocate the GPU memory
• Intended for light testing, since every call incurs this allocation and copy overhead

Non-Thunking (default):
• Existing applications need to be modified slightly to allocate/deallocate data in the GPU memory space (using CUBLAS_ALLOC and CUBLAS_FREE) and to copy data between GPU and CPU memory spaces (using CUBLAS_SET_VECTOR, CUBLAS_GET_VECTOR, CUBLAS_SET_MATRIX, and CUBLAS_GET_MATRIX)
• Intended for production code, high flexibility

February 10, 2012, PRACE Winter School 2012
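The difference between the two calling styles can be illustrated with a small host-only model in C. This is a sketch under stated assumptions, not the CUBLAS wrappers themselves: the "device" is ordinary heap memory, and every name here (DeviceStats, dev_alloc, dev_set_vector, dev_get_vector, dev_saxpy, saxpy_repeat_thunking, saxpy_repeat_non_thunking) is a hypothetical stand-in chosen to mirror the roles of CUBLAS_ALLOC, CUBLAS_SET_VECTOR, CUBLAS_GET_VECTOR, and CUBLAS_FREE. The counters only make visible how many allocations and transfers each style performs when the same operation is repeated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a GPU: "device" buffers live on the host heap,
   and the counters record allocations and host<->device copies. */
typedef struct {
    int allocs;
    int copies;
} DeviceStats;

static float *dev_alloc(DeviceStats *s, int n) {
    float *p = malloc((size_t)n * sizeof(float));
    assert(p != NULL);
    s->allocs++;
    return p;
}

static void dev_set_vector(DeviceStats *s, int n, const float *host, float *dev) {
    memcpy(dev, host, (size_t)n * sizeof(float));  /* CPU -> "GPU" */
    s->copies++;
}

static void dev_get_vector(DeviceStats *s, int n, const float *dev, float *host) {
    memcpy(host, dev, (size_t)n * sizeof(float));  /* "GPU" -> CPU */
    s->copies++;
}

/* Stand-in for the library routine running on device data: y = alpha*x + y. */
static void dev_saxpy(int n, float alpha, const float *dx, float *dy) {
    for (int i = 0; i < n; i++) dy[i] += alpha * dx[i];
}

/* Thunking style: every call allocates, copies in, computes, copies out, frees. */
void saxpy_repeat_thunking(DeviceStats *s, int n, float alpha,
                           const float *x, float *y, int reps) {
    for (int r = 0; r < reps; r++) {
        float *dx = dev_alloc(s, n);
        float *dy = dev_alloc(s, n);
        dev_set_vector(s, n, x, dx);
        dev_set_vector(s, n, y, dy);
        dev_saxpy(n, alpha, dx, dy);
        dev_get_vector(s, n, dy, y);
        free(dx);
        free(dy);
    }
}

/* Non-thunking style: the caller allocates and copies once, keeps the data
   on the "device" across all calls, and copies the result back at the end. */
void saxpy_repeat_non_thunking(DeviceStats *s, int n, float alpha,
                               const float *x, float *y, int reps) {
    float *dx = dev_alloc(s, n);
    float *dy = dev_alloc(s, n);
    dev_set_vector(s, n, x, dx);
    dev_set_vector(s, n, y, dy);
    for (int r = 0; r < reps; r++) dev_saxpy(n, alpha, dx, dy);
    dev_get_vector(s, n, dy, y);
    free(dx);
    free(dy);
}
```

In this model the thunking variant performs two allocations and three copies per call, whereas the non-thunking variant performs them once regardless of the number of calls. That matches the trade-off stated above: no code changes but per-call overhead, versus slight code changes and control over where the data lives.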