10.07.2015 Views

ATI Stream Computing OpenCL Programming Guide - CiteSeerX



ATI STREAM COMPUTING

Contents

Preface
Contents

Chapter 1 OpenCL Architecture and the ATI Stream Computing System

1.1 Software Overview
    1.1.1 Data-Parallel Programming Model
    1.1.2 Task-Parallel Programming Model
    1.1.3 Synchronization
1.2 Hardware Overview
1.3 The ATI Stream Computing Implementation of OpenCL
    1.3.1 Work-Item Processing
    1.3.2 Flow Control
    1.3.3 Work-Item Creation
    1.3.4 ATI Compute Abstraction Layer (CAL)
1.4 Memory Architecture and Access
    1.4.1 Memory Access
    1.4.2 Global Buffer
    1.4.3 Image Read/Write
    1.4.4 Memory Load/Store
1.5 Communication Between Host and GPU in a Compute Device
    1.5.1 PCI Express Bus
    1.5.2 Processing API Calls: The Command Processor
    1.5.3 DMA Transfers
1.6 GPU Compute Device Scheduling
1.7 Terminology
    1.7.1 Compute Kernel
    1.7.2 Wavefronts and Workgroups
    1.7.3 Local Data Store (LDS)
1.8 Programming Model
1.9 Example Programs
    1.9.1 First Example: Simple Buffer Write
    1.9.2 Second Example: SAXPY Function
    1.9.3 Third Example: Parallel Min() Function

Copyright © 2010 Advanced Micro Devices, Inc. All rights reserved.
