Tutorial: Introduction to CUDA Fortran | GTC 2013
Computing π with an Atomic Lock
Instead of storing back the partial sum:
! Each block writes back its partial sum
if (index == 1) partial(blockIdx%x)=psum(1)
use an atomic lock to ensure that only one block at a time updates the final sum:
if (index == 1) then
  do while ( atomiccas(lock,0,1) == 1) ! set lock
  end do
  partial(1)=partial(1)+psum(1) ! atomic update of partial(1)
  call threadfence() ! Wait for memory transaction to be visible to all the other threads
  lock = 0 ! release lock
end if
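On the host, the accumulator is zeroed before the kernel call and read back afterwards:
partial(1)=0 ! reset the final sum on the device
call sum(deviceData,partial,N)
inside=partial(1) ! copy the final sum back to the host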
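To show how the fragments above might fit together, here is a minimal, self-contained sketch. It is not the tutorial's full source: the module name `pi_lock`, the parameter `threads`, the variables `blocks`, `hostData` and `hostCount`, the launch configuration, and the choice of a 0/1 flag array as input are all assumptions made for illustration. Only the lock pattern, the kernel name `sum`, and the names `deviceData`, `partial`, `psum`, `lock` and `inside` come from the slide.

```fortran
module pi_lock
  use cudafor
  implicit none
  integer, parameter :: threads = 256   ! assumed block size (must be a power of two)
  integer, device    :: lock = 0        ! 0 = free, 1 = held
contains
  ! Sums the 0/1 flags in input(1:N) into partial(1).
  attributes(global) subroutine sum(input, partial, N)
    integer, value      :: N
    integer, intent(in) :: input(N)
    integer             :: partial(1)
    integer, shared     :: psum(threads)
    integer :: i, index, inext, interior

    index = threadIdx%x
    interior = 0
    ! Grid-stride loop: each thread accumulates a private count
    do i = (blockIdx%x - 1) * blockDim%x + index, N, blockDim%x * gridDim%x
      interior = interior + input(i)
    end do
    psum(index) = interior
    call syncthreads()

    ! Tree reduction in shared memory; psum(1) ends up holding the block total
    inext = blockDim%x / 2
    do while (inext >= 1)
      if (index <= inext) psum(index) = psum(index) + psum(index + inext)
      inext = inext / 2
      call syncthreads()
    end do

    ! One thread per block takes the lock and adds the block total
    if (index == 1) then
      do while (atomiccas(lock, 0, 1) == 1)   ! spin until the lock is acquired
      end do
      partial(1) = partial(1) + psum(1)
      call threadfence()                      ! make the update visible before releasing
      lock = 0                                ! release lock
    end if
  end subroutine sum
end module pi_lock

program compute_pi
  use cudafor
  use pi_lock
  implicit none
  integer, parameter :: N = 1024 * 1024       ! assumed number of sample points
  integer, parameter :: blocks = 64           ! assumed grid size
  integer, allocatable         :: hostData(:)
  integer, device, allocatable :: deviceData(:), partial(:)
  integer :: i, inside, hostCount
  real    :: x, y

  ! Flag each random point in the unit square: 1 if inside the quarter circle
  allocate(hostData(N), deviceData(N), partial(1))
  hostCount = 0
  do i = 1, N
    call random_number(x)
    call random_number(y)
    if (x * x + y * y <= 1.0) then
      hostData(i) = 1
      hostCount = hostCount + 1
    else
      hostData(i) = 0
    end if
  end do
  deviceData = hostData                        ! host-to-device copy

  partial(1) = 0
  call sum<<<blocks, threads>>>(deviceData, partial, N)
  inside = partial(1)                          ! device-to-host copy

  print *, 'GPU count:', inside, ' host count:', hostCount
  print *, 'pi is approximately', 4.0d0 * inside / N
  deallocate(hostData, deviceData, partial)
end program compute_pi
```

With a CUDA Fortran compiler this should build with something like `nvfortran -cuda` on a `.cuf` or `.f90` file. Note that only one thread per block (`index == 1`) spins on the lock, so threads within the same warp never compete with each other for it.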