Tutorial: Introduction to CUDA Fortran | GTC 2013
Execution Configuration
• GPU device memory has high latency: hundreds of cycles per memory request
• For good performance, you need to ensure there is enough parallelism to hide this latency
• Such parallelism can come from:
– Thread-level parallelism<br />
– Instruction-level parallelism
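
The two sources of parallelism listed above can be sketched with a simple copy kernel launched two ways. This is only an illustration under stated assumptions: the module and kernel names (`copy_kernels`, `copy_tlp`, `copy_ilp`), the array size, the block size of 256 threads, and the factor `ILP = 4` are hypothetical choices, not taken from the slides. The first kernel relies purely on thread-level parallelism (one element per thread, many threads in flight). The second launches fewer threads, but each thread issues several independent loads before any of them is used, so more memory requests per thread can be outstanding at once.

```fortran
module copy_kernels
  use cudafor
  implicit none
  ! Independent elements handled per thread (assumed value for illustration)
  integer, parameter :: ILP = 4
contains

  ! Thread-level parallelism: one element per thread.
  attributes(global) subroutine copy_tlp(dst, src, n)
    integer, value :: n
    real :: dst(n), src(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) dst(i) = src(i)
  end subroutine copy_tlp

  ! Instruction-level parallelism: each thread issues ILP independent
  ! loads before storing, so the loads can overlap within one thread.
  attributes(global) subroutine copy_ilp(dst, src, n)
    integer, value :: n
    real :: dst(n), src(n)
    real :: tmp(ILP)
    integer :: i, k
    ! Each block covers blockDim%x * ILP consecutive elements.
    i = (blockIdx%x - 1) * blockDim%x * ILP + threadIdx%x
    do k = 1, ILP
      if (i + (k - 1) * blockDim%x <= n) tmp(k) = src(i + (k - 1) * blockDim%x)
    end do
    do k = 1, ILP
      if (i + (k - 1) * blockDim%x <= n) dst(i + (k - 1) * blockDim%x) = tmp(k)
    end do
  end subroutine copy_ilp

end module copy_kernels

program exec_config
  use cudafor
  use copy_kernels
  implicit none
  integer, parameter :: n = 1024 * 1024   ! array length (assumed)
  integer, parameter :: tPB = 256         ! threads per block (assumed)
  real, allocatable :: a(:), b(:)
  real, device, allocatable :: a_d(:), b_d(:)
  integer :: nBlocks

  allocate(a(n), b(n))
  allocate(a_d(n), b_d(n))
  a = 1.0
  a_d = a                                 ! host-to-device copy

  ! Execution configuration: <<<number of blocks, threads per block>>>
  nBlocks = (n + tPB - 1) / tPB
  call copy_tlp<<<nBlocks, tPB>>>(b_d, a_d, n)
  b = b_d                                 ! device-to-host copy
  print *, 'TLP copy max error: ', maxval(abs(b - a))

  ! Same block size, but ILP times fewer blocks: each thread does more work.
  b_d = 0.0
  nBlocks = (n + tPB * ILP - 1) / (tPB * ILP)
  call copy_ilp<<<nBlocks, tPB>>>(b_d, a_d, n)
  b = b_d
  print *, 'ILP copy max error: ', maxval(abs(b - a))

  deallocate(a, b)
  deallocate(a_d, b_d)
end program exec_config
```

Building this requires a CUDA Fortran compiler and a CUDA-capable GPU. Which variant hides latency better depends on the device and on how many registers each thread uses, so treat the two launches as starting points for timing experiments rather than as a recommendation.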