Tutorial: Introduction to CUDA Fortran | GTC 2013
Thread-Level Parallelism
• Execution configuration dictates number of threads per block
– Limit on number of threads per block for each architecture
• Number of concurrent blocks on a multiprocessor limited by
– Register use per thread
– Shared memory use per thread block
– Limit on number of threads per multiprocessor
• Occupancy
– Ratio of actual to maximum number of concurrent threads per multiprocessor
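
The three limits above can be combined into a rough occupancy estimate. The following is a minimal sketch of that arithmetic in Python, not a CUDA Fortran API and not a replacement for a vendor occupancy calculator. The names `SMLimits` and `estimate_occupancy` are hypothetical, and the default limits are illustrative assumptions rather than values from the slides; real limits depend on the architecture. The sketch also ignores register and shared memory allocation granularity and any cap on resident blocks per multiprocessor, so it may overestimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SMLimits:
    """Per-architecture limits. The defaults are illustrative assumptions."""
    max_threads_per_block: int = 1024
    max_threads_per_sm: int = 2048
    registers_per_sm: int = 65536
    shared_mem_per_sm: int = 49152  # bytes


def estimate_occupancy(threads_per_block, regs_per_thread, smem_per_block,
                       limits=SMLimits()):
    """Return (concurrent blocks per multiprocessor, occupancy in [0, 1])."""
    if not 1 <= threads_per_block <= limits.max_threads_per_block:
        raise ValueError("threads per block outside the architecture limit")
    if regs_per_thread < 1 or smem_per_block < 0:
        raise ValueError("invalid resource usage")

    # Limit on number of threads per multiprocessor.
    by_threads = limits.max_threads_per_sm // threads_per_block
    # Register use per thread, summed over one block.
    by_regs = limits.registers_per_sm // (regs_per_thread * threads_per_block)
    # Shared memory use per thread block (no constraint if the kernel uses none).
    by_smem = (limits.shared_mem_per_sm // smem_per_block
               if smem_per_block > 0 else by_threads)

    blocks = min(by_threads, by_regs, by_smem)
    # Occupancy: actual concurrent threads over the maximum per multiprocessor.
    return blocks, blocks * threads_per_block / limits.max_threads_per_sm


if __name__ == "__main__":
    blocks, occupancy = estimate_occupancy(
        threads_per_block=256, regs_per_thread=40, smem_per_block=8192)
    print(blocks, occupancy)  # prints "6 0.75" under the assumed limits
```

With the assumed limits, 256 threads per block would allow 8 blocks by thread count alone, but 40 registers per thread and 8 KB of shared memory per block each cap the multiprocessor at 6 blocks, giving 1536 of 2048 possible threads.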