19.11.2014 Views

Tutorial: Introduction to CUDA Fortran | GTC 2013

Tutorial: Introduction to CUDA Fortran | GTC 2013

Tutorial: Introduction to CUDA Fortran | GTC 2013

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Asynchronous Data Transfers<br />

• Asynchronous host-device transfers return control immediately<br />

<strong>to</strong> CPU<br />

– cudaMemcpyAsync(dst, src, nElements, stream)<br />

– Requires pinned host memory<br />

• Overlapping data transfer with CPU computation<br />

– default stream = 0<br />

istat = cudaMemcpyAsync(a_d, a_h, N, 0)<br />

call kernel(a_d)<br />

call cpuFunction(b)<br />

overlapped

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!