29.01.2013

Tutorial CUDA

Language Extensions: Execution Configuration

A kernel function must be called with an execution configuration:

    __global__ void KernelFunc(...);
    dim3 DimGrid(100, 50);       // 5000 thread blocks
    dim3 DimBlock(4, 8, 8);      // 256 threads per block
    size_t SharedMemBytes = 64;  // 64 bytes of shared memory
    KernelFunc<<<DimGrid, DimBlock, SharedMemBytes>>>(...);
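As a concrete illustration of the execution configuration, here is a minimal sketch of a complete program that launches a kernel with the grid and block dimensions from the slide. The kernel body, the counter argument, and the helper name `countLaunchedThreads` are assumptions made for this example; the slide itself elides the kernel's parameters.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel body (not from the slide): each thread adds 1 to a
// global counter so the host can check how many threads were launched.
__global__ void KernelFunc(unsigned int *counter)
{
    atomicAdd(counter, 1u);
}

// Launches KernelFunc with the slide's configuration and returns the number
// of threads that ran, or 0 if a CUDA call failed.
unsigned int countLaunchedThreads()
{
    dim3 DimGrid(100, 50);       // 100 * 50 = 5000 thread blocks
    dim3 DimBlock(4, 8, 8);      // 4 * 8 * 8 = 256 threads per block
    size_t SharedMemBytes = 64;  // 64 bytes of dynamic shared memory per block

    unsigned int *d_counter = NULL;
    if (cudaMalloc(&d_counter, sizeof(unsigned int)) != cudaSuccess) return 0;
    cudaMemset(d_counter, 0, sizeof(unsigned int));

    // Execution configuration: <<<grid, block, dynamic shared memory bytes>>>
    KernelFunc<<<DimGrid, DimBlock, SharedMemBytes>>>(d_counter);

    unsigned int h_counter = 0;
    // A blocking cudaMemcpy on the default stream waits for the kernel first.
    cudaError_t err = cudaMemcpy(&h_counter, d_counter, sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return (err == cudaSuccess) ? h_counter : 0;
}

int main()
{
    printf("threads launched: %u\n", countLaunchedThreads());
    return 0;
}
```

On a working device the count should equal 5000 blocks times 256 threads per block. The 64 bytes of dynamic shared memory are requested here but not used by this kernel.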

© NVIDIA Corporation 2008

The optional SharedMemBytes bytes are:

  Allocated in addition to the compiler-allocated shared memory

  Mapped to any variable declared as:

    extern __shared__ float DynamicSharedMem[];
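To show how the third launch parameter maps onto an `extern __shared__` array, the following is a minimal sketch in which one block reverses a small array through dynamic shared memory. The kernel `ReverseInBlock` and the helper `reverseOnDevice` are hypothetical names introduced for this example. With 16 floats the request is 64 bytes, matching the size used above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: stages the input in dynamic shared memory, then
// writes it back in reverse order. Assumes a single block of n threads.
__global__ void ReverseInBlock(float *data, int n)
{
    // Unsized array: its storage is the byte count passed at launch time.
    extern __shared__ float DynamicSharedMem[];

    int t = threadIdx.x;
    DynamicSharedMem[t] = data[t];
    __syncthreads();  // every element must be staged before any is read back
    data[t] = DynamicSharedMem[n - 1 - t];
}

// Reverses host[0..n-1] in place on the device; returns false on CUDA errors.
// n must not exceed the device's maximum number of threads per block.
bool reverseOnDevice(float *host, int n)
{
    float *d_data = NULL;
    size_t bytes = n * sizeof(float);
    if (cudaMalloc(&d_data, bytes) != cudaSuccess) return false;
    cudaMemcpy(d_data, host, bytes, cudaMemcpyHostToDevice);

    // Third configuration argument: bytes of dynamic shared memory per block.
    ReverseInBlock<<<1, n, bytes>>>(d_data, n);

    cudaError_t err = cudaMemcpy(host, d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return err == cudaSuccess;
}

int main()
{
    float v[16];
    for (int i = 0; i < 16; ++i) v[i] = (float)i;
    if (reverseOnDevice(v, 16))
        printf("v[0] = %.0f, v[15] = %.0f\n", v[0], v[15]);
    return 0;
}
```

Sizing the array at launch time rather than at compile time lets the same kernel serve different block sizes, at the cost of the caller having to pass a matching byte count.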

Any call to a kernel function is asynchronous: control returns to the CPU immediately, without waiting for the kernel to finish.
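Because the launch returns before the kernel completes, the host has to synchronize explicitly before relying on the results. The sketch below is one way to do that, assuming the current runtime call `cudaDeviceSynchronize` (toolkits from the period of these slides used `cudaThreadSynchronize`, which has since been deprecated). `FillKernel` and `fillAndReadLast` are hypothetical names for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: writes each element's own index into the array.
__global__ void FillKernel(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

// Fills an n-element device array and returns its last element,
// or -1 if any CUDA call failed.
int fillAndReadLast(int n)
{
    int *d_out = NULL;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return -1;

    FillKernel<<<(n + 255) / 256, 256>>>(d_out, n);
    // Control is back on the CPU here; the kernel may still be running.

    cudaError_t launchErr = cudaGetLastError();     // errors from the launch itself
    cudaError_t syncErr = cudaDeviceSynchronize();  // blocks until the kernel finishes

    int last = -1;
    if (launchErr == cudaSuccess && syncErr == cudaSuccess)
        cudaMemcpy(&last, d_out + (n - 1), sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    return last;
}

int main()
{
    printf("last element: %d\n", fillAndReadLast(1 << 20));
    return 0;
}
```

Any CPU work placed between the launch and the synchronization call can overlap with the kernel's execution, which is the practical benefit of the asynchronous launch.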
