19.11.2014 Views

with CUDA Fortran

Execution Configuration

• GPUs are high latency: 100s of cycles per device memory request
• For good performance, you need to ensure there is enough parallelism to hide this latency
• Such parallelism can come from:
  – Thread-level parallelism
  – Instruction-level parallelism
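
As a rough illustration of how much parallelism "enough" might be, the back-of-the-envelope sketch below multiplies latency by issue rate to estimate how many independent memory requests must be in flight, then divides by the number of independent requests each thread issues. The function names and the numbers (400 cycles, one request per cycle, 4 independent loads per thread) are assumptions chosen for illustration, not figures from the slides or from any particular GPU.

```python
import math


def required_in_flight(latency_cycles, requests_per_cycle):
    """Independent requests that must be outstanding to cover the latency."""
    return math.ceil(latency_cycles * requests_per_cycle)


def threads_needed(latency_cycles, requests_per_cycle, independent_per_thread=1):
    """Threads required when each thread issues `independent_per_thread`
    independent requests before it has to wait (its instruction-level
    parallelism)."""
    in_flight = required_in_flight(latency_cycles, requests_per_cycle)
    return math.ceil(in_flight / independent_per_thread)


# Assumed, illustrative values only.
LATENCY_CYCLES = 400       # "100s of cycles" per device memory request
REQUESTS_PER_CYCLE = 1.0   # hypothetical issue rate

# Thread-level parallelism only: one outstanding request per thread.
tlp_only = threads_needed(LATENCY_CYCLES, REQUESTS_PER_CYCLE, 1)

# Adding instruction-level parallelism: 4 independent loads per thread.
with_ilp = threads_needed(LATENCY_CYCLES, REQUESTS_PER_CYCLE, 4)

print(tlp_only, with_ilp)
```

Under these assumptions, the same latency can be covered either by many threads with one outstanding request each or by fewer threads that each issue several independent requests, which is why both sources of parallelism matter when choosing an execution configuration.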
