with CUDA Fortran
Computing π with an Atomic Lock<br />
Instead of having each block write its partial sum back to global memory:<br />
! Each block writes back its partial sum<br />
if (index == 1) partial(BlockIdx%x)=psum(1)<br />
use an atomic lock to ensure that only one block at a time updates the final sum:<br />
if (index == 1) then<br />
do while ( atomiccas(lock,0,1) == 1) ! spin until the lock is acquired (0 -> 1)<br />
end do<br />
partial(1)=partial(1)+psum(1) ! update partial(1) while holding the lock<br />
call threadfence() ! make the update visible to all other threads before releasing<br />
lock = 0 ! release lock<br />
end if<br />
partial(1)=0<br />
call sum(deviceData,partial,N)<br />
inside=partial(1)
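
The fragments above can be assembled into one program. What follows is a minimal sketch under stated assumptions, not the original kernel. The module name `pi_lock_sketch`, the kernel name `count_inside`, the block size of 256, the `<<<64,256>>>` launch configuration, and the use of the host `random_number` intrinsic to generate the points are all illustrative choices that do not come from the slides. The kernel counts points inside the unit quarter circle, reduces the counts per block in shared memory, and then uses the lock so that one block at a time adds its result to `partial(1)`.

```fortran
module pi_lock_sketch              ! hypothetical module name
  use cudafor
  implicit none
  integer, device :: lock = 0      ! 0 = free, 1 = held
contains
  attributes(global) subroutine count_inside(input, partial, n)
    integer, value :: n
    real    :: input(n)            ! x in input(1:n/2), y in input(n/2+1:n)
    integer :: partial(1)          ! partial(1) accumulates the final count
    integer, shared :: psum(256)   ! assumes blockDim%x is a power of two <= 256
    integer :: i, index, inext, interior

    ! Each thread counts the points it visits that fall inside the circle
    index = threadIdx%x + (blockIdx%x - 1) * blockDim%x
    interior = 0
    do i = index, n/2, blockDim%x * gridDim%x
      if (input(i)**2 + input(i + n/2)**2 <= 1.0) interior = interior + 1
    end do

    ! Tree reduction of the per-thread counts within the block
    index = threadIdx%x
    psum(index) = interior
    call syncthreads()
    inext = blockDim%x / 2
    do while (inext >= 1)
      if (index <= inext) psum(index) = psum(index) + psum(index + inext)
      inext = inext / 2
      call syncthreads()
    end do

    ! One thread per block takes the lock and updates the final sum
    if (index == 1) then
      do while (atomiccas(lock, 0, 1) == 1)   ! spin until lock goes 0 -> 1
      end do
      partial(1) = partial(1) + psum(1)
      call threadfence()                      ! publish the update before release
      lock = 0
    end if
  end subroutine count_inside
end module pi_lock_sketch

program pi_lock_demo
  use cudafor
  use pi_lock_sketch
  implicit none
  integer, parameter :: n = 2**20             ! n/2 random (x,y) points
  real, allocatable         :: hostData(:)
  real, device, allocatable :: deviceData(:)
  integer, device :: partial(1)
  integer :: inside

  allocate(hostData(n), deviceData(n))
  call random_number(hostData)                ! assumption: host-side RNG
  deviceData = hostData                       ! host-to-device copy

  partial(1) = 0
  call count_inside<<<64, 256>>>(deviceData, partial, n)   ! assumed launch configuration
  inside = partial(1)                         ! device-to-host copy

  print *, 'pi is approximately', 4.0 * real(inside) / real(n/2)
end program pi_lock_demo
```

Only one thread per block contends for the lock, so threads of the same warp never spin against each other. For a plain integer sum like this one, a single `atomicadd` on `partial(1)` would give the same result without a lock; the lock pattern is more useful when the protected update is not a single atomic operation.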