26.12.2014 Views

DirectCompute Optimizations and Best Practices - Nvidia


Matrix Multiplication (cont.)

| Optimization | GeForce GTX 280 | GeForce GTX 8800 |
|---|---|---|
| No optimization | 8.8 GBps | 0.7 GBps |
| Coalesced using local memory to store a tile of A | 14.3 GBps | 8.2 GBps |
| Using thread group shared memory to eliminate redundant reads of a tile of B | 29.7 GBps | 15.7 GBps |
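The tiling idea behind the last two rows can be sketched on the CPU. This is a minimal C++ analogue, not the DirectCompute shader from the slides: the names `TILE`, `matmul_naive`, and `matmul_tiled` are hypothetical, and the two small staging arrays only stand in for what would be `groupshared` buffers in an HLSL compute shader. The point it illustrates is that each tile of A and B is read from the large arrays once and then reused for a whole block of output elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge; on a GPU this would match the thread group size.
constexpr std::size_t TILE = 16;

// Baseline: every output element re-reads a full row of A and column of B.
std::vector<float> matmul_naive(const std::vector<float>& a,
                                const std::vector<float>& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k) acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = acc;
        }
    return c;
}

// Tiled: stage one tile of A and one tile of B, then reuse them for a
// TILE x TILE block of outputs before moving on.
std::vector<float> matmul_tiled(const std::vector<float>& a,
                                const std::vector<float>& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    float tileA[TILE][TILE];  // stand-in for a shared-memory tile of A
    float tileB[TILE][TILE];  // stand-in for a shared-memory tile of B
    for (std::size_t bi = 0; bi < n; bi += TILE)
        for (std::size_t bj = 0; bj < n; bj += TILE)
            for (std::size_t bk = 0; bk < n; bk += TILE) {
                // Load phase: zero-pad past the matrix edge so n need not
                // be a multiple of TILE.
                for (std::size_t i = 0; i < TILE; ++i)
                    for (std::size_t j = 0; j < TILE; ++j) {
                        tileA[i][j] = (bi + i < n && bk + j < n)
                                          ? a[(bi + i) * n + (bk + j)] : 0.0f;
                        tileB[i][j] = (bk + i < n && bj + j < n)
                                          ? b[(bk + i) * n + (bj + j)] : 0.0f;
                    }
                // Compute phase: all reads now hit the staged tiles.
                for (std::size_t i = 0; i < TILE && bi + i < n; ++i)
                    for (std::size_t j = 0; j < TILE && bj + j < n; ++j) {
                        float acc = 0.0f;
                        for (std::size_t k = 0; k < TILE; ++k)
                            acc += tileA[i][k] * tileB[k][j];
                        c[(bi + i) * n + (bj + j)] += acc;
                    }
            }
    return c;
}
```

In an actual compute shader the load and compute phases would be separated by a group barrier (`GroupMemoryBarrierWithGroupSync()` in HLSL), and each thread would load one element of each tile rather than looping over it. The sketch shows only the data reuse pattern and says nothing about the throughput figures in the table.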
