26.12.2014 Views

DirectCompute Optimizations and Best Practices - Nvidia


What Is Our Optimization Goal?

• We should strive to reach GPU peak performance

• Choose the right metric:

— GFLOP/s: for compute-bound shaders

— Bandwidth: for memory-bound shaders

• Reductions have very low arithmetic intensity

— 1 flop per element loaded (bandwidth-optimal)

• Therefore we should strive for peak bandwidth

• We use a G80 GPU for this optimization

— 384-bit memory interface, 900 MHz DDR (1800 MT/s effective)

— 384 bits * 1800 MT/s / 8 bits per byte = 86.4 GB/s

— Optimization techniques are equally applicable to newer GPUs
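
The bandwidth arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, 1 GB is taken as 10^9 bytes, and the timing used in the usage example is a made-up placeholder rather than a measured result.

```python
def peak_bandwidth_gbs(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 models DDR memory, which moves data on both
    clock edges (900 MHz -> 1800 million transfers per second).
    """
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


def effective_bandwidth_gbs(bytes_read, seconds):
    """Bandwidth a shader actually achieved, from bytes read and elapsed time."""
    return bytes_read / seconds / 1e9


def reduction_flops_per_byte(n_elements, bytes_per_element=4):
    """Arithmetic intensity of a sum reduction: n - 1 adds over n loaded elements."""
    return (n_elements - 1) / (n_elements * bytes_per_element)


if __name__ == "__main__":
    # G80 figures from the slide: 384-bit interface, 900 MHz DDR.
    peak = peak_bandwidth_gbs(384, 900)
    print(f"theoretical peak: {peak:.1f} GB/s")

    # Hypothetical run: reducing 2**22 32-bit values in 0.5 ms (placeholder timing).
    n = 2 ** 22
    achieved = effective_bandwidth_gbs(n * 4, 0.5e-3)
    print(f"achieved: {achieved:.2f} GB/s ({100 * achieved / peak:.1f}% of peak)")
    print(f"arithmetic intensity: {reduction_flops_per_byte(n):.3f} flop/byte")
```

With roughly a quarter of a flop per byte loaded, a reduction is limited by memory traffic, so percentage of peak bandwidth is the natural figure to track while optimizing.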
