Lecture 2 – Threads - many-core.group

GPU strategy 2 – threads and blocks
GPU strategy 2 – kernel (part 1)

    // allocate an array in shared memory
    __shared__ float temp[NI_TILE][NJ_TILE];

    // find i and j indices of current thread
    ti = threadIdx.x;
    tj = threadIdx.y;
    i = blockIdx.x*(NI_TILE-2) + ti;
    j = blockIdx.y*(NJ_TILE-2) + tj;

    // index into linear memory for current thread
    i2d = i + ni*j;

    // if thread is in domain, read from global to shared memory
    if (i2d < ni*nj) {
        temp[ti][tj] = temp_in[i2d];
    }

    // make sure all threads have read in data
    __syncthreads();
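The index arithmetic in the kernel fragment above implies that neighbouring tiles overlap by two points, because each block advances by NI_TILE-2 rather than NI_TILE. The sketch below checks that arithmetic on the host. It is plain C++ that should also compile as CUDA host code, so no GPU is needed. The tile width of 16, the helper names global_index and blocks_needed, and the block-count formula are illustrative assumptions, not taken from the slides; the formula assumes that only tile-interior threads (ti = 1 .. NI_TILE-2) update points and that the domain boundary points are left unchanged.

```cpp
// Host-side sketch of the tile indexing used in the kernel fragment.
// NI_TILE's value and both helper names are illustrative assumptions.

constexpr int NI_TILE = 16;  // assumed tile width (threads per block in x)

// Global i index handled by thread ti of block bx, mirroring
// i = blockIdx.x*(NI_TILE-2) + ti from the kernel.
constexpr int global_index(int bx, int ti) {
    return bx * (NI_TILE - 2) + ti;
}

// Blocks needed in one direction so that the tile interiors
// (ti = 1 .. NI_TILE-2) cover the domain interior (i = 1 .. n-2).
// This is a ceiling division of (n-2) by (NI_TILE-2).
constexpr int blocks_needed(int n) {
    return (n - 2 + (NI_TILE - 2) - 1) / (NI_TILE - 2);
}
```

With these assumed values, the last two threads of block 0 (ti = 14, 15) map to the same global points as the first two threads of block 1 (ti = 0, 1), which is the one-point halo on each side of a tile. For a domain with ni = 100, blocks_needed(100) gives 7 blocks in the x direction.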
Page 1 and 2: Lecture 2 - Threads Graham Pullan D
Page 3 and 4: Threads, thread blocks and shared m
Page 5 and 6: Threads • Example in L1 made no u
Page 7 and 8: Kernel for c = a + b __global__ voi
Page 9 and 10: Threads, blocks and grid Block is s
Page 11 and 12: More on thread blocks • Thread bl
Page 13 and 14: Streaming multiprocessors and share
Page 15 and 16: Governing equation • Heat conduct
Page 17 and 18: 2D heat conduction • In 2D: ∂T
Page 19 and 20: Domain
Page 21 and 22: Results Initial field After 50000 s
Page 23 and 24: GPU strategy 1 • Start a thread f
Page 25 and 26: GPU strategy 1 - threads and blocks
Page 27 and 28: GPU strategy 1 - kernel launch code
Page 29 and 30: Performance • CPU - 1 core, Intel
Page 31: GPU strategy 2 - threads and blocks
Page 35 and 36: Performance • CPU - 1 core, Intel
Page 37 and 38: GPU strategy 3 • Can use larger b
Page 39 and 40: GPU strategy 3
Page 45 and 46: Lecture 2 summary
