Memory Bandwidth Limited Kernels - GPU Technology Conference


Impact of Address Alignment

• Warps should access aligned regions for maximum memory throughput
• L1 can help for misaligned loads if several warps are accessing a contiguous region
• ECC further significantly reduces misaligned store throughput

Experiment:
– Copy 16 MB of floats
– 256 threads/block

Greatest throughput drop:
– CA loads: 15%
– CG loads: 32%

© NVIDIA Corporation 2011
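The alignment bullets above can be illustrated with a small back-of-the-envelope model. This is a minimal sketch, not the benchmark from the slide. It assumes 32 threads per warp, 4-byte floats, 128-byte cache lines for cached (CA) loads, and 32-byte segments for uncached (CG) loads; those granularities are assumptions about Fermi-class hardware and are not stated on the slide. The helper names `segments_touched` and `bus_efficiency` are hypothetical.

```python
WARP_SIZE = 32    # threads per warp (assumption)
ELEM_BYTES = 4    # sizeof(float)


def segments_touched(offset_elems, segment_bytes):
    """Count the aligned segments of `segment_bytes` that one warp touches
    when thread i reads the float at index offset_elems + i."""
    first_byte = offset_elems * ELEM_BYTES
    last_byte = first_byte + WARP_SIZE * ELEM_BYTES - 1
    return last_byte // segment_bytes - first_byte // segment_bytes + 1


def bus_efficiency(offset_elems, segment_bytes):
    """Fraction of the fetched bytes that the warp actually requested."""
    requested = WARP_SIZE * ELEM_BYTES
    fetched = segments_touched(offset_elems, segment_bytes) * segment_bytes
    return requested / fetched


if __name__ == "__main__":
    # 128-byte lines model CA loads, 32-byte segments model CG loads
    # (both granularities are assumptions, see the text above).
    for offset in (0, 1, 8, 16, 31, 32):
        ca = segments_touched(offset, 128)
        cg = segments_touched(offset, 32)
        print(f"offset={offset:2d}  CA lines={ca}  CG segments={cg}  "
              f"CA eff={bus_efficiency(offset, 128):.2f}  "
              f"CG eff={bus_efficiency(offset, 32):.2f}")
```

The model only counts the bytes a single warp pulls across the bus. It ignores L1 hits from neighbouring warps, which is the effect the second bullet describes and one plausible reason the measured drop for CA loads (15%) is smaller than for CG loads (32%). It does not attempt to reproduce those measured figures.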
