DirectCompute Optimizations and Best Practices - Nvidia
Performance for 4M element reduction

Shader                                                       Time (2^22 ints)    Bandwidth
Shader 1: interleaved addressing with divergent branching    8.054 ms            2.083 GB/s

Note: Block Size = 128 threads for all tests
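
The figures above can be checked with a short sketch. The C++ below is a CPU emulation, not the deck's shader: `reduceBlockInterleaved` is a hypothetical name that walks one thread group's shared array the way an interleaved-addressing reduction would, with the `tid % (2 * s) == 0` test standing in for the divergent branch. `effectiveBandwidthGBps` is likewise a hypothetical helper; it assumes 4-byte ints, one read per input element, and 1 GB = 10^9 bytes, which is the convention that reproduces the 2.083 GB/s on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU emulation of ONE thread group performing an interleaved-addressing
// reduction. `shared` plays the role of group-shared memory; its size is the
// block size (128 on the slide) and is assumed to be a power of two.
inline std::int64_t reduceBlockInterleaved(std::vector<std::int64_t>& shared) {
    const std::size_t n = shared.size();
    assert(n > 0 && (n & (n - 1)) == 0);  // power-of-two block size

    for (std::size_t s = 1; s < n; s *= 2) {
        // On a GPU all threads evaluate this test in parallel. Only every
        // (2*s)-th thread does work, so neighbouring threads take different
        // paths: this is the divergent branch named on the slide.
        for (std::size_t tid = 0; tid < n; ++tid) {
            if (tid % (2 * s) == 0) {
                shared[tid] += shared[tid + s];
            }
        }
        // A real shader would need a group-wide barrier here before the
        // next step reads the partial sums.
    }
    return shared[0];
}

// Effective bandwidth in GB/s (1 GB = 10^9 bytes), counting only the bytes
// of input that are read once.
inline double effectiveBandwidthGBps(std::size_t elements,
                                     std::size_t bytesPerElement,
                                     double milliseconds) {
    const double bytes = static_cast<double>(elements) *
                         static_cast<double>(bytesPerElement);
    return bytes / (milliseconds * 1e-3) / 1e9;
}
```

Calling `effectiveBandwidthGBps(1u << 22, 4, 8.054)` gives roughly 2.083, matching the table: 2^22 ints is 16,777,216 bytes, read in 8.054 ms.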