Tutorial CUDA
Reduce 2
Distribute threads differently to achieve coalesced memory reads
Bonus: No need to ping pong anymore
© NVIDIA Corporation 2008
[Figure: Thread IDs 0 … t-1 within each block and Block IDs 0 … b-1, where t = numThreadsPerBlock and b = numBlocks. Highlighted: elements read by a warp in one memory access (thread IDs 0 … 31).]
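A minimal sketch of the thread distribution pictured on this slide, not the tutorial's own kernel. It assumes the reduction is an integer sum and that the array is laid out as the figure suggests: thread j of block k reads element pass * (b * t) + k * t + j, so the 32 threads of a warp touch 32 adjacent elements in a single access. The names elementIndex, reducePartial, in, partial and n are hypothetical.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: index of the element read by thread `tid` of block
// `bid` on pass `pass`, with t = numThreadsPerBlock and b = numBlocks.
// Consecutive thread IDs map to consecutive elements (coalesced reads).
__host__ __device__ inline int elementIndex(int pass, int bid, int tid,
                                            int t, int b)
{
    return pass * (b * t) + bid * t + tid;
}

// Assumed kernel shape: each thread sums its strided elements in a register
// and writes one partial result. `partial` must hold b * t ints.
__global__ void reducePartial(const int* in, int* partial, int n)
{
    int t = blockDim.x;   // numThreadsPerBlock
    int b = gridDim.x;    // numBlocks
    int sum = 0;

    // Indices grow with `pass`, so the first out-of-range index ends the loop.
    for (int pass = 0; ; ++pass) {
        int i = elementIndex(pass, blockIdx.x, threadIdx.x, t, b);
        if (i >= n) break;
        sum += in[i];
    }

    // Single write per thread; the input buffer is never overwritten.
    partial[blockIdx.x * t + threadIdx.x] = sum;
}
```

One possible reading of the "no ping pong" bonus: because each thread accumulates privately and writes once, the kernel does not need to swap input and output buffers between passes. The b * t partial sums still have to be combined afterwards, for example by a second launch with a single block or on the host.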