DirectCompute Optimizations and Best Practices - Nvidia
Performance for 4M element reduction

Shader                                                     Time (2^22 ints)  Bandwidth   Step Speedup  Cumulative Speedup
Shader 1: interleaved addressing with divergent branching  8.054 ms          2.083 GB/s
Shader 2: interleaved addressing with bank conflicts       3.456 ms          4.854 GB/s  2.33x         2.33x
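
The bandwidth and speedup figures above can be re-derived from the timings alone. The following is a minimal sketch, not part of the original slides. It assumes 4-byte ints, 1 GB = 10^9 bytes, and that bandwidth counts only the input bytes read; the function names are hypothetical.

```python
# Sketch: re-derive the bandwidth and speedup columns from the timings.
# Assumptions (not stated in the source): 4-byte ints, 1 GB = 10**9 bytes,
# and bandwidth = input bytes read / elapsed time.

N_ELEMENTS = 2 ** 22   # "4M" elements, as given in the table header
BYTES_PER_INT = 4      # assumed element size


def effective_bandwidth_gb_s(time_ms: float) -> float:
    """Return input bytes processed per second, in GB/s."""
    total_bytes = N_ELEMENTS * BYTES_PER_INT
    return total_bytes / (time_ms * 1e-3) / 1e9


def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the second timing is than the first."""
    return before_ms / after_ms


# Timings taken from the table.
shader1_ms = 8.054
shader2_ms = 3.456

bw1 = effective_bandwidth_gb_s(shader1_ms)
bw2 = effective_bandwidth_gb_s(shader2_ms)
step = speedup(shader1_ms, shader2_ms)

print(f"Shader 1: {bw1:.3f} GB/s")
print(f"Shader 2: {bw2:.3f} GB/s")
print(f"Step speedup: {step:.2f}x")
```

Under these assumptions the results agree with the table to within rounding in the last digit, which suggests the slides measure bandwidth against the 16 MB input buffer. Because Shader 2 is the first optimization step, its step speedup and cumulative speedup are the same value.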