12.07.2015 Views

GPU Performance Analysis and Optimization - GPU Technology ...

Case Study 4: Results

• Original code (Array of Structures):
  – Time: 22.8 ms
  – 24.5 transactions per load, 6 transactions per store
• Code using read-only loads:
  – Time: 15.7 ms (1.45x speedup over original)
• Code with Structure of Arrays data layout:
  – Time: 9.3 ms (2.45x speedup over original)
  – Successive threads access successive words
    • 3 transactions per load request
      – Due to offset halo reads; addressed with non-caching loads: 8.9 ms (2.56x speedup)
    • 2 transactions per store request

© 2012, NVIDIA
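
The link between data layout and transactions per request can be sketched with a small counting model. This is only an illustration under stated assumptions, not the code or the measurement method from the case study: the 128-byte segment size, the 8-byte word size, the hypothetical 64-byte structure, and the function name `transactions_per_request` are all chosen for the example. It counts how many distinct memory segments one warp of 32 threads touches when each thread reads a single word.

```python
# Hypothetical model: count the memory segments touched by one warp-wide load.
# WARP_SIZE, SEGMENT_BYTES, the word size, and the struct size are assumptions
# made for illustration; they are not values taken from the slides.
WARP_SIZE = 32
SEGMENT_BYTES = 128


def transactions_per_request(base, stride, word_bytes,
                             warp_size=WARP_SIZE, segment_bytes=SEGMENT_BYTES):
    """Count distinct segments touched when thread t reads
    word_bytes bytes starting at byte address base + t * stride."""
    segments = set()
    for t in range(warp_size):
        first = base + t * stride
        last = first + word_bytes - 1
        segments.update(range(first // segment_bytes,
                              last // segment_bytes + 1))
    return len(segments)


WORD = 8          # assumed 8-byte elements
STRUCT = 64       # hypothetical structure of eight 8-byte fields

# Array of Structures: consecutive threads are one whole struct apart.
aos = transactions_per_request(base=0, stride=STRUCT, word_bytes=WORD)

# Structure of Arrays: consecutive threads read consecutive words.
soa_aligned = transactions_per_request(base=0, stride=WORD, word_bytes=WORD)

# Same access shifted by one element, as an offset (halo-style) read might be.
soa_offset = transactions_per_request(base=WORD, stride=WORD, word_bytes=WORD)

print(aos, soa_aligned, soa_offset)
```

Under these assumptions the strided Array of Structures access touches many more segments than the Structure of Arrays access, and shifting the Structure of Arrays access off a segment boundary costs one extra segment. The model does not try to reproduce the measured 24.5 transactions per load; real counts depend on the actual structure size, the cache path used, and the GPU architecture.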
