Tutorial: Introduction to CUDA Fortran | GTC 2013
Thread-Level Parallelism
• Run on K20
– Maximum of 2048 concurrent threads per multiprocessor
– Maximum of 16 concurrent blocks per multiprocessor
– Maximum of 1024 threads per block
Thread Block Size   Occupancy   Bandwidth (GB/s)
       32             0.25             96
       64             0.5             125
      128             1.0             136
      256             1.0             137
      512             1.0             137
     1024             1.0             133
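
The occupancy column can be reproduced from the three K20 limits listed above. The following is a minimal sketch in Python, not part of the original tutorial: the function name `occupancy` and the constant names are hypothetical, and the calculation assumes that only the thread and block limits matter. Register and shared-memory usage, which can lower occupancy further, are ignored.

```python
# Per-multiprocessor limits quoted on the slide for the K20.
MAX_THREADS_PER_SM = 2048
MAX_BLOCKS_PER_SM = 16
MAX_THREADS_PER_BLOCK = 1024


def occupancy(block_size):
    """Fraction of the multiprocessor's thread slots that can be filled.

    Assumption: only the thread and block limits apply; register and
    shared-memory limits are not modelled here.
    """
    if not 1 <= block_size <= MAX_THREADS_PER_BLOCK:
        raise ValueError("block size must be between 1 and 1024")
    # Resident blocks are capped by the block limit and by how many
    # whole blocks fit within the thread limit.
    resident_blocks = min(MAX_BLOCKS_PER_SM, MAX_THREADS_PER_SM // block_size)
    return resident_blocks * block_size / MAX_THREADS_PER_SM


if __name__ == "__main__":
    for size in (32, 64, 128, 256, 512, 1024):
        print(f"{size:5d}  {occupancy(size):.2f}")
```

With blocks of 32 or 64 threads, the 16-block limit is reached before the 2048-thread limit, so at most 512 or 1024 threads are resident and occupancy stays below 1.0. From 128 threads per block upward, the thread limit is the binding one and the multiprocessor can be fully occupied.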