Memory Bandwidth Limited Kernels - GPU Technology Conference
Non-caching Load
• Warp requests 32 misaligned, consecutive 4-byte words
• Addresses fall within at most 5 segments
• Warp needs 128 bytes
• 160 bytes move across the bus on misses
• Bus utilization: at least 80%
  - Some misaligned patterns will fall within 4 segments, so 100% utilization
[Figure: addresses from a warp mapped onto memory addresses 0 to 448, marked every 32 bytes]
© NVIDIA Corporation 2011
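The arithmetic on this slide can be checked with a small sketch. It assumes a segment size of 32 bytes, which is inferred from the slide's own numbers (160 bytes across 5 segments) rather than stated outright. The names `segments_touched` and `bus_utilization` are hypothetical helpers for illustration, not part of any NVIDIA API.

```python
# Assumed constants: 32-byte segments (inferred from 160 bytes / 5 segments),
# 32 threads per warp, one 4-byte word per thread.
SEGMENT_BYTES = 32
WARP_SIZE = 32
WORD_BYTES = 4


def segments_touched(base_addr: int) -> int:
    """Count the segments covered by a warp reading consecutive words from base_addr."""
    first = base_addr // SEGMENT_BYTES
    last = (base_addr + WARP_SIZE * WORD_BYTES - 1) // SEGMENT_BYTES
    return last - first + 1


def bus_utilization(base_addr: int) -> float:
    """Bytes the warp needs divided by bytes moved across the bus."""
    needed = WARP_SIZE * WORD_BYTES
    moved = segments_touched(base_addr) * SEGMENT_BYTES
    return needed / moved


if __name__ == "__main__":
    for base in (96, 100):
        print(base, segments_touched(base), bus_utilization(base))
```

Under these assumptions, a base address of 100 straddles 5 segments (160 bytes moved, 80% utilization), while a base address of 96 is segment-aligned and touches only 4 segments (128 bytes moved, 100% utilization).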