Tutorial CUDA
Hardware Implementation: Memory Architecture

The global, constant, and texture memory spaces are regions of device memory.

Each multiprocessor has:
- A set of 32-bit registers per processor (8192 per multiprocessor on G80)
- On-chip shared memory (16 KB on G80), where the shared memory space resides
- A read-only constant cache, to speed up access to the constant memory space
- A read-only texture cache, to speed up access to the texture memory space

© NVIDIA Corporation 2008
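As a rough illustration of how these memory spaces appear in source code, the sketch below touches global, constant, and shared memory plus registers in one small kernel. The names `TILE`, `kScale`, `reverseScale`, and `runReverseScale` are hypothetical and chosen only for this example; the texture space is left out, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define TILE 256  // hypothetical block size; 256 floats use 1 KB of shared memory

// Constant memory space: a region of device memory read through the constant cache.
__constant__ float kScale;

// Reverses each TILE-sized chunk of `in` and multiplies it by kScale.
__global__ void reverseScale(const float *in, float *out)
{
    // Shared memory space: on-chip, visible to all threads of one block.
    __shared__ float tile[TILE];

    // Small local scalars such as these are normally kept in registers.
    int t = threadIdx.x;
    int base = blockIdx.x * TILE;

    tile[t] = in[base + t];   // global (device) memory -> shared memory
    __syncthreads();          // wait until the whole tile has been loaded
    out[base + t] = kScale * tile[TILE - 1 - t];  // shared -> global memory
}

// Host helper (hypothetical). Assumes host.size() is a multiple of TILE.
std::vector<float> runReverseScale(const std::vector<float> &host, float scale)
{
    size_t bytes = host.size() * sizeof(float);
    float *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, bytes);    // allocations live in global device memory
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, host.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(kScale, &scale, sizeof(float));  // fill constant memory

    int blocks = static_cast<int>(host.size() / TILE);
    reverseScale<<<blocks, TILE>>>(dIn, dOut);

    std::vector<float> result(host.size());
    cudaMemcpy(result.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return result;
}
```

Calling `runReverseScale(data, 2.0f)` on a 512-element vector would launch two blocks, each staging its own 256-element tile in shared memory before writing the reversed, scaled values back to global memory.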
[Figure: a device containing Multiprocessors 1 to N. Each multiprocessor contains Processors 1 to M, each with its own registers, together with an instruction unit, shared memory, a constant cache, and a texture cache. Device memory is shown alongside the multiprocessors.]
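The per-device figures behind this layout can be read at run time. The sketch below uses `cudaGetDeviceProperties` from the CUDA runtime; the function name `printMemoryArchitecture` is hypothetical. Note that the runtime reports register and shared memory limits per block rather than per multiprocessor in these particular fields.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Prints a few memory-related properties of every CUDA device.
// Returns the number of devices found (0 if the query fails).
int printMemoryArchitecture()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) return 0;

    for (int d = 0; d < count; ++d) {
        cudaDeviceProp p;
        if (cudaGetDeviceProperties(&p, d) != cudaSuccess) continue;
        printf("Device %d: %s\n", d, p.name);
        printf("  multiprocessors:         %d\n", p.multiProcessorCount);
        printf("  32-bit registers/block:  %d\n", p.regsPerBlock);
        printf("  shared memory/block:     %zu bytes\n", p.sharedMemPerBlock);
        printf("  constant memory:         %zu bytes\n", p.totalConstMem);
        printf("  device (global) memory:  %zu bytes\n", p.totalGlobalMem);
    }
    return count;
}
```

The printed values depend on the installed hardware and will differ from the G80 figures quoted on the slide.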