Implementing Finite Volume algorithms on GPUs - many-core.group ...
Implementing Finite Volume algorithms on GPUs - many-core.group ...
Implementing Finite Volume algorithms on GPUs - many-core.group ...
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
What block-size should we use? (y-directi<strong>on</strong>)<br />
In y directi<strong>on</strong>, different effects come into play<br />
For global memory access coalesence, want to read several adjacent<br />
cells in x-directi<strong>on</strong>.<br />
So, 4 × 32 is not the obvious answer<br />
Block size Overall time<br />
16 × 8 6.93s<br />
32 × 4 8.41s<br />
4 × 32 6.54s<br />
8 × 16 6.51s<br />
8 × 32 6.45s<br />
<str<strong>on</strong>g>Finite</str<strong>on</strong>g> <str<strong>on</strong>g>Volume</str<strong>on</strong>g> Methods<br />
Laboratory for Scientific<br />
Computing<br />
17 / 22