Tutorial: Introduction to CUDA Fortran | GTC 2013
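Instruction-Level Parallelism
• Have each thread process multiple elements
! ILP (elements per thread) is assumed to be an integer parameter
! defined in the enclosing module, so tmp(ILP) has a compile-time size.
attributes(global) subroutine copy_ILP(odata, idata)
  real :: odata(*), idata(*), tmp(ILP)
  integer :: i, j
  ! First element for this thread; each block covers blockDim%x*ILP elements
  i = (blockIdx%x-1)*blockDim%x*ILP + threadIdx%x
  ! Issue all ILP independent loads first (stride blockDim%x keeps accesses coalesced)
  do j = 1, ILP
    tmp(j) = idata(i+(j-1)*blockDim%x)
  enddo
  ! Then write the ILP values back out with the same stride
  do j = 1, ILP
    odata(i+(j-1)*blockDim%x) = tmp(j)
  enddo
end subroutine copy_ILP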
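
The slide shows only the kernel. Because each thread copies ILP elements, the launch needs ILP times fewer blocks than a one-element-per-thread copy. The following is a minimal sketch of a complete program under stated assumptions: ILP = 4, 256 threads per block, and an array length that is an exact multiple of nThreads*ILP (the kernel has no bounds check). The module name copy_kernels, the constants nThreads and n, and the host program are illustrative and do not come from the slide.

```fortran
module copy_kernels
  use cudafor
  implicit none
  ! Assumed values for illustration; the slide does not specify them
  integer, parameter :: ILP = 4          ! elements copied per thread
  integer, parameter :: nThreads = 256   ! threads per block
contains
  attributes(global) subroutine copy_ILP(odata, idata)
    real :: odata(*), idata(*), tmp(ILP)
    integer :: i, j
    i = (blockIdx%x-1)*blockDim%x*ILP + threadIdx%x
    do j = 1, ILP
      tmp(j) = idata(i+(j-1)*blockDim%x)
    enddo
    do j = 1, ILP
      odata(i+(j-1)*blockDim%x) = tmp(j)
    enddo
  end subroutine copy_ILP
end module copy_kernels

program test_copy_ILP
  use cudafor
  use copy_kernels
  implicit none
  ! n must be a multiple of nThreads*ILP because the kernel has no bounds check
  integer, parameter :: n = 1024*1024
  real, allocatable :: h_in(:), h_out(:)
  real, device, allocatable :: d_in(:), d_out(:)
  integer :: k, nBlocks

  allocate(h_in(n), h_out(n), d_in(n), d_out(n))
  do k = 1, n
    h_in(k) = real(k)
  enddo

  d_in = h_in      ! host-to-device copy
  d_out = 0.0

  ! Each block covers nThreads*ILP elements, so fewer blocks are launched
  nBlocks = n / (nThreads*ILP)
  call copy_ILP<<<nBlocks, nThreads>>>(d_out, d_in)

  h_out = d_out    ! device-to-host copy; blocks until the kernel has finished
  if (all(h_out == h_in)) then
    print *, 'copy_ILP: OK'
  else
    print *, 'copy_ILP: mismatch'
  endif

  deallocate(h_in, h_out, d_in, d_out)
end program test_copy_ILP
```

With a current NVIDIA HPC SDK this should build with something like `nvfortran -cuda copy_ilp.cuf` (the file name is hypothetical). Changing ILP only requires recompiling, since the block count is derived from it.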