30.01.2015 Views

Parallel Computing On Qualcomm Platforms Using OpenCL - Uplinq


OpenCL Memory Model

• Memory Types

– Private: on-chip, per-work-item temporary storage (register file)

– Local: on-chip, read-write, shared within a compute unit

– Constant: on-chip, read-only, shared within a compute unit

– Global: system memory, shared between all compute units; can reside on the host or the device
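To make the four memory types concrete, here is a minimal sketch in plain C that emulates a small 3-tap smoothing kernel on the host, so it runs without an OpenCL runtime. The kernel itself, the names `smooth_group`, `smooth`, `weights`, and the work-group size `GROUP_SIZE` are all assumptions chosen for illustration; they do not come from the slides. Comments mark which OpenCL address space each variable would occupy in a real kernel.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 4 /* hypothetical work-group size */

/* Constant: read-only coefficients (OpenCL C: __constant float weights[3]). */
static const float weights[3] = {0.25f, 0.5f, 0.25f};

/* Emulates one work-group of a 3-tap smoothing kernel.
 * in/out stand in for __global buffers of length n (n >= 1). */
void smooth_group(const float *in, float *out, size_t n, size_t group_id)
{
    assert(in != NULL && out != NULL && n >= 1);

    /* Local: one tile per work-group, with a one-element halo on each side
     * (OpenCL C: __local float tile[GROUP_SIZE + 2]). */
    float tile[GROUP_SIZE + 2];
    size_t base = group_id * GROUP_SIZE;

    /* Phase 1: copy global -> local, clamping indices at the array edges. */
    for (size_t t = 0; t < GROUP_SIZE + 2; ++t) {
        size_t src = base + t;          /* tile[t] mirrors in[base + t - 1] */
        src = (src == 0) ? 0 : src - 1;
        if (src >= n) src = n - 1;
        tile[t] = in[src];
    }

    /* A real kernel would call barrier(CLK_LOCAL_MEM_FENCE) here. */

    /* Phase 2: each work-item reads its neighbours from the local tile. */
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid) {
        size_t gid = base + lid;
        if (gid >= n) break;
        /* Private: per-work-item accumulator, typically held in registers. */
        float acc = 0.0f;
        for (size_t k = 0; k < 3; ++k)
            acc += weights[k] * tile[lid + k];
        out[gid] = acc;                 /* result goes back to global memory */
    }
}

/* Host-side driver: runs ceil(n / GROUP_SIZE) emulated work-groups. */
void smooth(const float *in, float *out, size_t n)
{
    size_t groups = (n + GROUP_SIZE - 1) / GROUP_SIZE;
    for (size_t g = 0; g < groups; ++g)
        smooth_group(in, out, n, g);
}
```

Calling `smooth(in, out, n)` on two float arrays of length `n` fills `out` with the smoothed values. Staging the input in the local tile means each global element is fetched once per work-group instead of up to three times, which is the usual motivation for using local memory.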

• Memory model

– OpenCL provides a barrier mechanism for synchronization

o Memory written by one work-item is not guaranteed to be visible to other work-items unless explicitly synchronized

– Multiple distinct address spaces

– Address spaces can be collapsed depending on the device’s memory subsystem
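The role of the barrier can be sketched with a work-group sum reduction. The code below is plain C that emulates one work-group on the host, not an OpenCL kernel; the function name `reduce_group` and the `scratch` buffer are hypothetical. Each pass of the outer loop corresponds to one step that all work-items execute together, and the comment marks where a real kernel would place `barrier(CLK_LOCAL_MEM_FENCE)`.

```c
#include <assert.h>
#include <stddef.h>

/* Emulates a tree reduction performed by one work-group in local memory.
 * scratch stands in for a __local buffer holding group_size partial values;
 * group_size must be a power of two. Returns the sum of all elements. */
int reduce_group(int *scratch, size_t group_size)
{
    assert(scratch != NULL);
    assert(group_size > 0 && (group_size & (group_size - 1)) == 0);

    for (size_t stride = group_size / 2; stride > 0; stride /= 2) {
        /* Work-items with lid < stride take part in this step. On a device
         * they run concurrently; the inner loop models that single step. */
        for (size_t lid = 0; lid < stride; ++lid)
            scratch[lid] += scratch[lid + stride];

        /* Barrier point: in OpenCL C this is barrier(CLK_LOCAL_MEM_FENCE).
         * Without it, a work-item could read scratch[lid + stride] in the
         * next step before its neighbour has finished writing it. */
    }
    return scratch[0];
}
```

Within one step the writes touch indices below `stride` and the reads touch indices from `stride` upward, so work-items do not conflict inside a step. The hazard only arises between steps, which is why a single barrier per iteration is enough in this pattern.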

• Diagram shows typical memory layout for a GPU architecture

[Figure: a Compute Device contains Compute Units 1 to N. Each compute unit holds ALUs #1 to #M, each with its own Private Memory, and one Local Memory shared by those ALUs. Below the compute units is a Global / Constant Memory Data Cache, backed by Global Memory in Compute Device Memory.]

Qualcomm Proprietary
