DirectCompute Optimizations and Best Practices - Nvidia
Idle Threads
Problem:
    for (unsigned int s=groupDim_x/2; s>0; s>>=1) {
        if (tid < s) {
            sdata[tid] += sdata[tid + s];
        }
        GroupMemoryBarrierWithGroupSync();
    }
Half of the threads are idle on the first loop iteration!
This is wasteful…
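
To make the idle-thread count concrete, the loop above can be modelled on the CPU. The following is only a sketch in plain C++, not DirectCompute code: the helper names `activeThreadsPerIteration` and `reduceGroup` are hypothetical, the group size is assumed to be a non-zero power of two, and the thread group is emulated by an ordinary sequential loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for each iteration of the reduction loop, count how
// many of the group's threads pass the `tid < s` test and do useful work.
// Assumes groupDimX is a non-zero power of two.
std::vector<unsigned> activeThreadsPerIteration(unsigned groupDimX) {
    assert(groupDimX != 0 && (groupDimX & (groupDimX - 1)) == 0);
    std::vector<unsigned> active;
    for (unsigned s = groupDimX / 2; s > 0; s >>= 1) {
        unsigned count = 0;
        for (unsigned tid = 0; tid < groupDimX; ++tid) {
            if (tid < s) {
                ++count;  // this thread performs an add in this iteration
            }
        }
        active.push_back(count);
    }
    return active;
}

// Hypothetical helper: sequential emulation of the same loop over a copy of
// the group's shared data. Within one iteration the reads (index >= s) and
// the writes (index < s) never overlap, so a plain loop gives the same result
// as the barrier-separated parallel version.
float reduceGroup(std::vector<float> sdata) {
    const std::size_t n = sdata.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    for (std::size_t s = n / 2; s > 0; s >>= 1) {
        for (std::size_t tid = 0; tid < s; ++tid) {
            sdata[tid] += sdata[tid + s];
        }
    }
    return sdata[0];
}
```

For a group of 256 threads, `activeThreadsPerIteration(256)` starts at 128 and halves each step, so at most half of the group ever does an add, and far fewer in later iterations.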