GPU Compute accelerated HEVC decoder on ARM® Mali™-T600 ...
Challenges in CPU+GPU Implementation
Efficient partitioning of work between CPU and GPU
• The effective FPS of the decoder is the minimum of the FPS achieved by the CPU and the GPU on their respective work.
• The partitioning therefore needs to be balanced so that both perform their respective work at almost the same speed (FPS).
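The "minimum of the two rates" rule can be illustrated with a minimal sketch. It assumes each decoder stage has a known per-frame cost on either processor and that the two sides run fully overlapped. The function names (`effective_fps`, `best_partition`) and the stage costs in the usage note are hypothetical and are not taken from the slides.

```python
from itertools import product


def effective_fps(cpu_ms, gpu_ms):
    """Frames per second when CPU and GPU work overlaps: the slower side sets the pace."""
    slowest = max(cpu_ms, gpu_ms)
    return float("inf") if slowest == 0 else 1000.0 / slowest


def best_partition(stage_costs):
    """Brute-force the stage-to-processor assignment with the highest effective FPS.

    stage_costs maps a stage name to (cpu_ms, gpu_ms), the assumed per-frame
    cost of that stage on each processor.
    """
    names = list(stage_costs)
    best_fps, best_assignment = 0.0, None
    for choice in product(("cpu", "gpu"), repeat=len(names)):
        # Sum the per-frame time each processor spends on the stages assigned to it.
        cpu_ms = sum(stage_costs[n][0] for n, c in zip(names, choice) if c == "cpu")
        gpu_ms = sum(stage_costs[n][1] for n, c in zip(names, choice) if c == "gpu")
        fps = effective_fps(cpu_ms, gpu_ms)
        if fps > best_fps:
            best_fps, best_assignment = fps, dict(zip(names, choice))
    return best_fps, best_assignment
```

For example, with made-up costs `{"a": (10, 5), "b": (10, 20)}`, putting stage `a` on the GPU and stage `b` on the CPU leaves the CPU with 10 ms and the GPU with 5 ms per frame, so the CPU limits the decoder to 100 FPS. Every other assignment is slower.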
Efficient pipelining of data between CPU and GPU
• The algorithms running on the CPU depend on the output of the algorithms running on the GPU and/or vice versa.
• A good design should make sure that neither the CPU nor the GPU spends any time waiting for the output of the other.
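One common way to keep both sides busy is a small bounded queue between the two stages, so the producer can run a few frames ahead of the consumer. The sketch below only simulates this with two Python threads. The names `run_pipeline`, `cpu_stage`, and `gpu_stage`, and the queue depth, are assumptions for illustration and do not describe the decoder's actual design.

```python
import queue
import threading


def run_pipeline(num_frames, depth=2):
    """Simulate a CPU stage feeding a GPU stage through a bounded hand-off queue."""
    handoff = queue.Queue(maxsize=depth)  # depth=2 behaves like double buffering
    finished = []

    def cpu_stage():
        for frame in range(num_frames):
            # Blocks only when the consumer is already `depth` frames behind.
            handoff.put(frame)
        handoff.put(None)  # sentinel: no more frames

    def gpu_stage():
        while True:
            frame = handoff.get()
            if frame is None:
                break
            finished.append(frame)  # stand-in for the GPU-side work on this frame

    workers = [threading.Thread(target=cpu_stage), threading.Thread(target=gpu_stage)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return finished
```

Calling `run_pipeline(5)` returns the frames in decode order. A deeper queue tolerates more variation in per-frame time, at the cost of more buffers in flight.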
Cache coherency<br />
• Cache coherency must be ensured for data shared between the CPU and the GPU.