19.11.2014 Views

Tutorial: Introduction to CUDA Fortran | GTC 2013

Execution Configuration

• GPUs are high latency, 100s of cycles per device memory request

• For good performance, you need to ensure there is enough parallelism to hide this latency

• Such parallelism can come from:

– Thread-level parallelism

– Instruction-level parallelism
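
The two sources of parallelism above can be sketched in CUDA Fortran roughly as follows. This is an illustrative sketch, not code from the tutorial: the module name `latency_m`, the kernels `copy_tlp` and `copy_ilp`, the constant `ILP = 4`, the block size of 256 threads, and the array size are all assumptions chosen for the example. The first kernel relies on many threads, each moving one element; the second launches fewer threads, each issuing several independent loads before any store.

```fortran
module latency_m
  use cudafor
  implicit none
  ! Hypothetical value: number of elements each thread handles in copy_ilp
  integer, parameter :: ILP = 4
contains

  ! Thread-level parallelism: one element per thread, so latency is
  ! hidden by having many threads in flight.
  attributes(global) subroutine copy_tlp(dst, src, n)
    real :: dst(*), src(*)
    integer, value :: n
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) dst(i) = src(i)
  end subroutine copy_tlp

  ! Instruction-level parallelism: each thread issues ILP independent
  ! loads before any store, so several memory requests overlap per thread.
  attributes(global) subroutine copy_ilp(dst, src, n)
    real :: dst(*), src(*)
    integer, value :: n
    integer :: i, j, k
    real :: tmp(ILP)
    i = (blockIdx%x - 1) * blockDim%x * ILP + threadIdx%x
    do j = 1, ILP
      k = i + (j - 1) * blockDim%x
      if (k <= n) tmp(j) = src(k)
    end do
    do j = 1, ILP
      k = i + (j - 1) * blockDim%x
      if (k <= n) dst(k) = tmp(j)
    end do
  end subroutine copy_ilp

end module latency_m

program main
  use cudafor
  use latency_m
  implicit none
  integer, parameter :: n = 1024 * 1024   ! assumed problem size
  integer, parameter :: nThreads = 256    ! assumed threads per block
  real, allocatable :: a(:), b(:)
  real, device, allocatable :: a_d(:), b_d(:)
  integer :: nBlocks

  allocate(a(n), b(n), a_d(n), b_d(n))
  call random_number(a)
  a_d = a                                 ! host-to-device copy

  ! Execution configuration for TLP: enough blocks to cover n elements
  nBlocks = (n + nThreads - 1) / nThreads
  call copy_tlp<<<nBlocks, nThreads>>>(b_d, a_d, n)
  b = b_d                                 ! device-to-host copy
  print *, 'TLP max error: ', maxval(abs(b - a))

  ! Execution configuration for ILP: ILP times fewer blocks
  b_d = 0.0
  nBlocks = (n + nThreads * ILP - 1) / (nThreads * ILP)
  call copy_ilp<<<nBlocks, nThreads>>>(b_d, a_d, n)
  b = b_d
  print *, 'ILP max error: ', maxval(abs(b - a))

  deallocate(a, b, a_d, b_d)
end program main
```

Building this requires a CUDA Fortran compiler (for example `nvfortran` with a `.cuf` source file). Which variant performs better, and which block size and `ILP` value suit a given GPU, should be determined by measurement rather than assumed from this sketch.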
