29.01.2013 Views

Tutorial CUDA


Arithmetic Instruction Throughput

Reciprocal, reciprocal square root, sin/cos, log, exp: 16 cycles per warp

These are the fast intrinsic versions, prefixed with "__"

Examples: __sinf(), __cosf(), __expf(), __logf()

Other functions are combinations of the above:

y/x == rcp(x)*y takes 20 cycles per warp

sqrt(x) == rcp(rsqrt(x)) takes 32 cycles per warp

© NVIDIA Corporation 2008
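To make the distinction between the standard math functions and the "__" intrinsics concrete, here is a minimal sketch that evaluates both in one kernel and reports how far the results drift apart. The kernel name `compare`, the numerator `y`, and the input range are assumptions chosen for illustration and do not come from the slide. The division and square root lines mirror the combinations listed above, using `__fdividef()` and `rsqrtf()` as stand-ins for the slide's `rcp` and `rsqrt`.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: evaluates the same expression twice, once with the
// standard math functions and once with the fast "__" intrinsics.
__global__ void compare(const float* x, float* accurate, float* fast, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float v = x[i];
    float y = 2.0f;  // illustrative numerator for the y/x case

    // Standard versions: slower, but more accurate.
    accurate[i] = sinf(v) + expf(v) + logf(v) + y / v + sqrtf(v);

    // Intrinsic versions: reduced accuracy, higher throughput.
    fast[i] = __sinf(v) + __expf(v) + __logf(v)
            + __fdividef(y, v)              // y/x as a fast division
            + __fdividef(1.0f, rsqrtf(v));  // sqrt(x) as rcp(rsqrt(x))
}

int main()
{
    const int n = 1024;
    const size_t bytes = n * sizeof(float);

    // Inputs in [0.5, 4.5) keep log, division and rsqrt well defined.
    std::vector<float> h_x(n), h_acc(n), h_fast(n);
    for (int i = 0; i < n; ++i) h_x[i] = 0.5f + 4.0f * i / n;

    float *d_x = nullptr, *d_acc = nullptr, *d_fast = nullptr;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_acc, bytes);
    cudaMalloc(&d_fast, bytes);
    cudaMemcpy(d_x, h_x.data(), bytes, cudaMemcpyHostToDevice);

    compare<<<(n + 255) / 256, 256>>>(d_x, d_acc, d_fast, n);

    // These copies wait for the kernel to finish before reading results.
    cudaMemcpy(h_acc.data(), d_acc, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(h_fast.data(), d_fast, bytes, cudaMemcpyDeviceToHost);
    cudaError_t err = cudaGetLastError();

    // Largest absolute difference between the two evaluations.
    float max_diff = 0.0f;
    for (int i = 0; i < n; ++i)
        max_diff = fmaxf(max_diff, fabsf(h_acc[i] - h_fast[i]));
    printf("max |accurate - fast| = %g\n", max_diff);

    cudaFree(d_x);
    cudaFree(d_acc);
    cudaFree(d_fast);
    return err == cudaSuccess ? 0 : 1;
}
```

Build it with `nvcc` without `-use_fast_math`. That flag maps the standard calls onto the intrinsics as well, which would make the two results identical and hide the difference the sketch is meant to show.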
