19.11.2014

with CUDA Fortran


Instruction-Level Parallelism with CUF Kernels

• Occurs when the product of the specified grid and block sizes is smaller than the loop trip count in that dimension

!$cuf kernel do (2)
do j = 1, ny
  do i = 1, nx
    c_d(i,j) = a_d(i,j) + b_d(i,j)
  enddo
enddo

• If nx==1024, each thread calculates 4 elements of c_d
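The case above can be sketched as a complete program. The launch configuration `<<< (1,*), (256,1) >>>` is an assumption, since the slide text does not show one: with a grid x-extent of 1 and a block x-extent of 256, only 256 threads span the i loop, so for nx = 1024 each thread handles 1024/256 = 4 elements of c_d. The value ny = 256 and the host arrays a, b, c are likewise illustrative and do not come from the slide.

```fortran
program cuf_ilp_sketch
  use cudafor
  implicit none
  integer, parameter :: nx = 1024, ny = 256   ! ny is an illustrative choice
  real :: a(nx,ny), b(nx,ny), c(nx,ny)        ! host arrays (hypothetical names)
  real, device :: a_d(nx,ny), b_d(nx,ny), c_d(nx,ny)
  integer :: i, j

  a = 1.0
  b = 2.0
  a_d = a          ! host-to-device copies
  b_d = b

  ! Assumed launch configuration: grid x-extent 1, block x-extent 256.
  ! 1*256 = 256 threads cover the i loop of length nx = 1024,
  ! so each thread computes 1024/256 = 4 elements of c_d.
  ! The '*' lets the compiler choose the grid y-extent to cover the j loop.
  !$cuf kernel do (2) <<< (1,*), (256,1) >>>
  do j = 1, ny
     do i = 1, nx
        c_d(i,j) = a_d(i,j) + b_d(i,j)
     enddo
  enddo

  c = c_d          ! device-to-host copy
  print *, 'max error: ', maxval(abs(c - 3.0))
end program cuf_ilp_sketch
```

Saved with a `.cuf` extension, this should build with a CUDA Fortran compiler (for example `nvfortran -cuda`) and report a maximum error of zero. Changing the block x-extent changes how many elements each thread computes without touching the loop body.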
