29.01.2013 Views

Tutorial CUDA


Hardware Implementation: Execution Model

Each active block is split into warps in a well-defined way

Warps are time-sliced

In other words:

Threads within a warp are executed physically in parallel

Warps and blocks are executed logically in parallel
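
The "well-defined way" in which a block is split into warps can be sketched with plain integer arithmetic. The example below is a host-side C++ illustration, not code from the tutorial. It assumes a warp size of 32 (consistent with the slide's diagram, where a row of 64 threads spans two warps) and the row-major thread numbering documented in the CUDA Programming Guide, where thread (x, y) of a block of width Dx has linear ID x + y * Dx. The names `Dim2`, `linear_thread_id`, `warp_index`, `lane_index`, and `warps_per_block` are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>

// Assumption: 32 threads per warp, as on the hardware this slide describes.
constexpr int kWarpSize = 32;

// Hypothetical stand-in for the x/y parts of CUDA's threadIdx and blockDim.
struct Dim2 {
    int x;
    int y;
};

// Linear ID of a thread inside a 2D block: x varies fastest (row-major).
constexpr int linear_thread_id(Dim2 thread, Dim2 block_dim) {
    return thread.x + thread.y * block_dim.x;
}

// Warp that a thread belongs to: consecutive linear IDs are grouped in 32s.
constexpr int warp_index(Dim2 thread, Dim2 block_dim) {
    return linear_thread_id(thread, block_dim) / kWarpSize;
}

// Position of the thread inside its warp (0..31).
constexpr int lane_index(Dim2 thread, Dim2 block_dim) {
    return linear_thread_id(thread, block_dim) % kWarpSize;
}

// Number of warps needed for a block, rounding up for a partial last warp.
inline int warps_per_block(Dim2 block_dim) {
    assert(block_dim.x > 0 && block_dim.y > 0);
    const int threads = block_dim.x * block_dim.y;
    return (threads + kWarpSize - 1) / kWarpSize;
}

// Compile-time checks for a 64 x 3 block like Block (1, 1) in the diagram.
constexpr Dim2 kBlockDim{64, 3};
static_assert(warp_index({0, 0}, kBlockDim) == 0, "first thread is in warp 0");
static_assert(warp_index({31, 0}, kBlockDim) == 0, "thread (31, 0) ends warp 0");
static_assert(warp_index({32, 0}, kBlockDim) == 1, "thread (32, 0) starts warp 1");
static_assert(warp_index({0, 1}, kBlockDim) == 2, "second row starts warp 2");
static_assert(warp_index({63, 2}, kBlockDim) == 5, "last thread is in warp 5");
```

Inside a kernel the same values would come from `threadIdx` and `blockDim` rather than from the `Dim2` arguments. Under these assumptions a 64 x 3 block yields six warps, and which of them run at a given moment is up to the hardware scheduler, which is why warps and blocks are only logically parallel.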

© NVIDIA Corporation 2008

[Figure: The host launches Kernel 1 and Kernel 2, which run on the device as Grid 1 and Grid 2. Grid 1 contains Blocks (0, 0) through (2, 1). Block (1, 1) is expanded to show Threads (0, 0) through (63, 2) grouped into Warps 0 to 5: in each row, Threads 0 to 31 form one warp and Threads 32 to 63 form the next.]
