Lecture 2 – Threads - many-core.group
GPU strategy 2 – kernel (part 1)
// allocate an array in shared memory
__shared__ float temp[NI_TILE][NJ_TILE];
// find i and j indices of current thread
ti = threadIdx.x;
tj = threadIdx.y;
i = blockIdx.x*(NI_TILE-2) + ti;
j = blockIdx.y*(NJ_TILE-2) + tj;
// index into linear memory for current thread
i2d = i + ni*j;
// if thread is in domain, read from global to shared memory
// (test i and j separately: i2d < ni*nj alone lets a thread with i >= ni wrap onto the next row)
if (i < ni && j < nj) {
    temp[ti][tj] = temp_in[i2d];
}
// make sure all threads have read in data
__syncthreads();
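The index arithmetic in this kernel can be checked on the host. The following is a minimal sketch in plain C++ (it should also compile as host code in a .cu file), not part of the lecture's kernel. It assumes that the "-2" in the block stride leaves a one-cell halo on each side of a tile, which the slide does not state explicitly. The names TILE, global_index, blocks_needed, linear_in_range and in_domain are hypothetical, and TILE = 6 is an arbitrary value chosen for illustration.

```cpp
// Hypothetical tile width; the slide does not give the value of NI_TILE.
constexpr int TILE = 6;

// Global index of thread `thread` in block `block`, same formula as the kernel.
// Blocks advance by TILE-2, so neighbouring tiles share two cells.
constexpr int global_index(int block, int thread) {
    return block * (TILE - 2) + thread;
}

// Blocks needed along one axis of length n so that every interior point
// 1..n-2 is an interior cell of some tile (assumes a one-cell halo per side).
constexpr int blocks_needed(int n) {
    return (n - 2 + (TILE - 2) - 1) / (TILE - 2);
}

// Bounds test on the linear index only.
constexpr bool linear_in_range(int i, int j, int ni, int nj) {
    return i + ni * j < ni * nj;
}

// Bounds test on each axis separately.
constexpr bool in_domain(int i, int j, int ni, int nj) {
    return i < ni && j < nj;
}

// Block 0 covers i = 0..5 and block 1 starts at i = 4: cells 4 and 5 are shared.
static_assert(global_index(0, TILE - 1) == 5, "last cell of block 0");
static_assert(global_index(1, 0) == 4, "first cell of block 1");

// A 12-point axis has 10 interior points, 4 per tile, so 3 blocks.
static_assert(blocks_needed(12) == 3, "blocks along a 12-point axis");

// Thread 4 of block 2 maps to i = 12, one past the last column of a 12x12 grid.
// On row j = 0 its linear index is 12, which is still below ni*nj = 144.
static_assert(global_index(2, 4) == 12, "thread past the right edge");
static_assert(linear_in_range(12, 0, 12, 12), "linear test accepts it");
static_assert(!in_domain(12, 0, 12, 12), "per-axis test rejects it");
```

With these numbers the last block along each axis runs past the edge of the grid, so some of its threads must skip the load. A test on the linear index alone would let such a thread read the first element of the following row instead.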