Introduction to CUDA C/C++

More documents

Recommendations

Info

CUDA Threads • Terminology: a block can be split into parallel threads • Let’s change add() to use parallel threads instead of parallel blocks Using blocks: __global__ __global__ void add(int void *a, add(int *b, *a, int *c) *b, { int *c) { c[blockIdx.x] c[blockIdx.x] = a[blockIdx.x] = a[blockIdx.x] + b[blockIdx.x]; + b[blockIdx.x]; } } add(d_a, d_b, d_c); Using threads: __global__ void add(int *a, int *b, int *c) { c[threadIdx.x] = a[threadIdx.x] + b[threadIdx.x]; } add(d_a, d_b, d_c); © NVIDIA Corporation 2011
CONCEPTS Heterogeneous Computing Blocks Threads Indexing Shared memory __syncthreads() Asynchronous operation COMBINING THREADS AND BLOCKS Handling errors Managing devices © NVIDIA Corporation 2011
Page 1 and 2: An Introduction to GPU Computing an
Page 3 and 4: What is CUDA? • CUDA Architecture
Page 5 and 6: Prerequisites • You (probably) ne
Page 7 and 8: CONCEPTS Heterogeneous Computing Bl
Page 9 and 10: Heterogeneous Computing #include #
Page 11 and 12: Simple Processing Flow PCI Bus 1. C
Page 13 and 14: Hello World! int main(void) { print
Page 15 and 16: Hello World! with Device Code __glo
Page 17 and 18: Hello World! with Device Code __glo
Page 19 and 20: Addition on the Device • A simple
Page 21 and 22: Memory Management • Host and devi
Page 23 and 24: Addition on the Device: main() int
Page 27 and 28: Vector Addition on the Device • W
Page 29 and 30: Vector Addition on the Device: main
Page 31 and 32: Review (1 of 2) • Difference betw
Page 33: CONCEPTS Heterogeneous Computing Bl
Page 37 and 38: Indexing Arrays with Blocks and Thr
Page 39 and 40: Addition with Blocks and Threads: m
Page 41 and 42: Handling Arbitrary Vector Sizes •
Page 45 and 46: Implementing Within a Block • Eac
Page 47 and 48: Implementing With Shared Memory •
Page 49 and 50: Stencil Kernel // Apply the stencil
Page 51 and 52: __syncthreads() • void __syncthre
Page 53 and 54: Stencil Kernel // Apply the stencil
Page 55 and 56: Review (2 of 2) • Use __shared__
Page 57 and 58: Coordinating Host & Device • Kern
Page 59 and 60: Device Management • Application c
Page 61 and 62: Resources • We skipped some detai

Introduction to CUDA C/C++

Create successful ePaper yourself

Delete template?

Save as template?