12.07.2015 Views

GPU Performance Analysis and Optimization - GPU Technology ...



Pattern Category 1: Offset Access
• Cause:
  – Region addressed by a warp is not aligned on a cache-line boundary
• Issue:
  – Wasted bandwidth: only a fraction of some lines is used
  – Some increase in latency
• Symptom:
  – Transactions per request 1.5-2.0x higher than ideal
  – Likely: moderate to medium L1 hit rate
• Remedies:
  – Extra padding for data to force alignment
  – Try non-caching loads, read-only loads
    • Reduce overfetched bytes, but don't fully solve the problem
© 2012, NVIDIA
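
The arithmetic behind the symptom and the remedies can be sketched on the host, without a GPU. The model below is a rough illustration and not a profiler replacement. It assumes a 32-thread warp in which each thread loads one consecutive 4-byte word, a 128-byte line for cached loads, and 32-byte segments for non-caching loads. Those sizes are assumptions typical of GPUs from the slide's era and are not stated on the slide. The function names are hypothetical.

```python
def lines_touched(offset_bytes, line_bytes, threads=32, word_bytes=4):
    """Count the lines (or segments) covered by one warp-wide load.

    Assumes thread i reads word_bytes bytes starting at
    offset_bytes + i * word_bytes (a contiguous, unit-stride access).
    """
    first = offset_bytes // line_bytes
    last = (offset_bytes + threads * word_bytes - 1) // line_bytes
    return last - first + 1


def bytes_fetched(offset_bytes, line_bytes):
    """Total bytes moved for the request, including overfetch."""
    return lines_touched(offset_bytes, line_bytes) * line_bytes


def pad_to_line(offset_bytes, line_bytes=128):
    """Round a start offset up to the next line boundary (padding remedy)."""
    return -(-offset_bytes // line_bytes) * line_bytes


if __name__ == "__main__":
    useful = 32 * 4  # bytes the warp actually needs
    for label, offset, line in [
        ("aligned, cached (128 B lines)", 0, 128),
        ("offset by 4 B, cached (128 B lines)", 4, 128),
        ("offset by 4 B, non-caching (32 B segments)", 4, 32),
        ("padded to alignment, cached", pad_to_line(4), 128),
    ]:
        fetched = bytes_fetched(offset, line)
        print(f"{label}: {lines_touched(offset, line)} transaction(s), "
              f"{fetched} B fetched for {useful} B used")
```

Under these assumptions, a 4-byte offset doubles the cached transactions from 1 to 2 and fetches 256 bytes for 128 bytes used. The same offset with 32-byte segments fetches 160 bytes, which is less overfetch but still more than the aligned case. Only padding to the line boundary brings the request back to a single transaction, which matches the slide's note that non-caching and read-only loads reduce the waste without fully solving the problem.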
