19.11.2014 Views

Tutorial: Introduction to CUDA Fortran | GTC 2013



Reductions on GPU

• Parallelism across blocks

• Parallelism within a block

• No global synchronization

– two-stage approach (two kernel launches), same code for both stages

Tree-based sum of eight values, halving the number of elements at each step:

3 1 7 0 4 1 6 3

4 7 5 9

11 14

25

Level 0:

8 blocks (each block reduces its own segment with the same tree, producing one partial result)

3 1 7 0 4 1 6 3

4 7 5 9

11 14

25

Level 1:

1 block (reduces the 8 partial results from Level 0)
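
The two-level scheme above could be sketched in CUDA Fortran roughly as follows. This is a minimal illustration, not the code from the tutorial: the names `reduce_m`, `partialSum`, `blockSize`, `partial_d` and `total_d` are hypothetical, integer data is assumed, and the block size is assumed to be a power of two. The same kernel is launched twice, first with 8 blocks (Level 0) and then with 1 block (Level 1), because there is no global synchronization inside a single launch.

```fortran
module reduce_m
  use cudafor
  implicit none
  ! Assumption: threads per block, must be a power of two for the tree loop
  integer, parameter :: blockSize = 256
contains
  ! Each block sums its share of a(1:n) and writes one value to partial
  attributes(global) subroutine partialSum(a, n, partial)
    integer, value :: n
    integer, intent(in) :: a(n)
    integer :: partial(:)
    integer, shared :: s(blockSize)
    integer :: i, tid, stride, acc

    tid = threadIdx%x
    acc = 0
    ! Parallelism across blocks: grid-stride loop over the input
    i = (blockIdx%x - 1)*blockDim%x + tid
    do while (i <= n)
      acc = acc + a(i)
      i = i + blockDim%x*gridDim%x
    end do
    s(tid) = acc
    call syncthreads()

    ! Parallelism within a block: tree reduction in shared memory
    stride = blockDim%x/2
    do while (stride >= 1)
      if (tid <= stride) s(tid) = s(tid) + s(tid + stride)
      call syncthreads()
      stride = stride/2
    end do
    if (tid == 1) partial(blockIdx%x) = s(1)
  end subroutine partialSum
end module reduce_m

program twoStageSum
  use cudafor
  use reduce_m
  implicit none
  integer, parameter :: nBlocks = 8, n = 100000
  integer :: a(n), total(1)
  integer, device :: a_d(n), partial_d(nBlocks), total_d(1)

  a = 1
  a_d = a
  ! Level 0: 8 blocks, one partial sum per block
  call partialSum<<<nBlocks, blockSize>>>(a_d, n, partial_d)
  ! Level 1: 1 block reduces the 8 partial sums with the same kernel
  call partialSum<<<1, blockSize>>>(partial_d, nBlocks, total_d)
  total = total_d
  print *, 'GPU sum:', total(1), ' CPU sum:', sum(a)
end program twoStageSum
```

The kernel boundary between the two launches acts as the global synchronization point: all Level 0 partial sums are complete in device memory before the Level 1 block reads them.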
