31.07.2014

Implementing Finite Volume algorithms on GPUs - many-core.group ...



What block size should we use? (x-direction)

When performing the flux calculation, we want to reduce the number of overlap cells relative to the block size.

Block size    Overall time
16 × 8        7.13s
32 × 4        6.93s
64 × 2        6.95s
32 × 8        7.02s
64 × 4        6.98s
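To make the overlap argument concrete, here is a minimal sketch that computes the number of overlap cells relative to the block size for each block shape in the table. It assumes an x-direction sweep needs `GHOST` extra cells on each side of a block in x only; the value `GHOST = 2` and the names `overlap_ratio` and `BLOCK_SIZES` are illustrative assumptions, since the slide does not state the stencil width.

```python
# Assumption: each block needs GHOST overlap cells on both its left and right
# side in the x-direction. The slide does not give the stencil width, so
# GHOST = 2 is a hypothetical value chosen only for illustration.
GHOST = 2

# Block shapes (x, y) taken from the table above.
BLOCK_SIZES = [(16, 8), (32, 4), (64, 2), (32, 8), (64, 4)]


def overlap_ratio(block_x, block_y, ghost=GHOST):
    """Overlap cells divided by interior cells for an x-direction sweep."""
    interior = block_x * block_y
    overlap = 2 * ghost * block_y  # ghost columns on the left and the right
    return overlap / interior


if __name__ == "__main__":
    for bx, by in BLOCK_SIZES:
        print(f"{bx:2d} x {by}: overlap/interior = {overlap_ratio(bx, by):.2%}")
```

Under this assumption the ratio simplifies to `2 * GHOST / block_x`, so it depends only on the block width: 25% for a width of 16, 12.5% for 32 and 6.25% for 64. That is the motivation for preferring blocks that are wide in x, although the measured times show that the overlap is not the only factor.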

The first 4 lines can have 4 thread blocks per multiprocessor (register limitation).

The last two lines are limited to 3 thread blocks per multiprocessor.
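How a register limit caps the number of resident thread blocks can be sketched with a simplified model. The function `resident_blocks` and every hardware figure in the usage example are hypothetical: the slide gives neither the GPU model nor the registers used per thread. Real hardware also allocates registers in fixed-size chunks and has shared-memory limits, so this sketch is not expected to reproduce the counts quoted above.

```python
def resident_blocks(threads_per_block, regs_per_thread,
                    regs_per_sm, max_threads_per_sm, max_blocks_per_sm):
    """Simplified estimate of thread blocks resident on one multiprocessor.

    Ignores register allocation granularity and shared-memory limits.
    """
    # Blocks that fit into the register file.
    by_registers = regs_per_sm // (regs_per_thread * threads_per_block)
    # Blocks allowed by the resident-thread limit.
    by_threads = max_threads_per_sm // threads_per_block
    # The tightest of the three limits wins.
    return min(by_registers, by_threads, max_blocks_per_sm)


if __name__ == "__main__":
    # Hypothetical hardware figures, chosen only to illustrate the calculation.
    hw = dict(regs_per_sm=16384, max_threads_per_sm=1024, max_blocks_per_sm=8)
    for bx, by in [(32, 4), (32, 8)]:
        n = resident_blocks(bx * by, regs_per_thread=32, **hw)
        print(f"{bx} x {by} ({bx * by} threads): {n} blocks per multiprocessor")
```

In this model, the 128-thread blocks are capped by registers well before the thread or block limits apply, which is the kind of register limitation the slide refers to.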

Finite Volume Methods

Laboratory for Scientific Computing

