Tutorial CUDA
Example: Avoiding Non-Coalesced float3 Memory Accesses
float3 is 12 bytes
Each thread ends up executing 3 reads
sizeof(float3) ≠ 4, 8, or 16
Half-warp reads three 64B non-contiguous regions
© NVIDIA Corporation 2008
[Figure: threads t0, t1, t2, t3 each reading from a row of consecutive float3 elements; the first read is shown]
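The arithmetic behind these points can be checked with a small host-side model. This is only a sketch under stated assumptions, not device code: it assumes thread t of a 16-thread half-warp reads element t of a densely packed float3 array that starts on a 64-byte boundary, and that each thread issues three separate 4-byte loads. The names `byte_offset`, `thread_stride`, `segments_touched`, and `check_half_warp` are hypothetical helpers introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Assumed constants for this illustration.
constexpr std::size_t kHalfWarp     = 16;  // threads serviced together
constexpr std::size_t kFloat3Bytes  = 12;  // three 4-byte floats, no padding
constexpr std::size_t kWordBytes    = 4;   // size of one float load
constexpr std::size_t kSegmentBytes = 64;  // region size named on the slide

// Byte offset touched by `thread` on its `read`-th 4-byte load (read = 0, 1, 2),
// assuming thread t reads array element t.
constexpr std::size_t byte_offset(std::size_t thread, std::size_t read) {
    return thread * kFloat3Bytes + read * kWordBytes;
}

// Distance in bytes between neighbouring threads within a single load.
constexpr std::size_t thread_stride(std::size_t read) {
    return byte_offset(1, read) - byte_offset(0, read);
}

// Number of distinct 64-byte segments the half-warp touches on one load.
std::size_t segments_touched(std::size_t read) {
    std::set<std::size_t> segments;
    for (std::size_t t = 0; t < kHalfWarp; ++t) {
        segments.insert(byte_offset(t, read) / kSegmentBytes);
    }
    return segments.size();
}

// Checks the figures quoted on the slide against the model.
void check_half_warp() {
    // Neighbouring threads are 12 bytes apart, not 4, so a load is not contiguous.
    assert(thread_stride(0) == 12);
    // Each load fetches 16 * 4 = 64 bytes of useful data...
    assert(kHalfWarp * kWordBytes == 64);
    // ...but those bytes are scattered across three 64-byte segments.
    for (std::size_t read = 0; read < 3; ++read) {
        assert(segments_touched(read) == 3);
    }
    // In total the half-warp covers 16 * 12 = 192 bytes.
    assert(byte_offset(kHalfWarp - 1, 2) + kWordBytes == 192);
}
```

Under these assumptions, the same 192 bytes read as plain floats, with consecutive threads touching consecutive 4-byte words, would have a thread stride of 4 bytes and cover one 64-byte segment per load.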