10.07.2015 Views

ATI Stream Computing OpenCL Programming Guide - CiteSeerX


The CL_MEM_COPY_HOST_PTR flag instructs the runtime to copy the contents of the host pointer pX to initialize the buffer bufX. The bufX buffer uses the CL_MEM_READ_ONLY flag, while bufY requires the CL_MEM_READ_WRITE flag.

    bufX = cl::Buffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                      sizeof(cl_float) * length, pX);

8. Create a program object from the kernel source string, build the program for our devices, and create a kernel object corresponding to the SAXPY kernel. (At this point, it is possible to create multiple kernel objects if the program contains more than one kernel.)

    cl::Program::Sources sources(1, std::make_pair(kernelStr.c_str(),
                                                   kernelStr.length()));
    program = cl::Program(context, sources);
    program.build(devices);
    kernel = cl::Kernel(program, "saxpy");

9. Enqueue the kernel for execution on the device (GPU in our example). Set each argument individually in separate kernel.setArg() calls. The arguments do not need to be set again for subsequent kernel enqueue calls; reset only those arguments that are to pass a new value to the kernel. Then, enqueue the kernel to the command queue with the appropriate global and local work sizes.

    kernel.setArg(0, bufX);
    kernel.setArg(1, bufY);
    kernel.setArg(2, a);
    queue.enqueueNDRangeKernel(kernel, cl::NDRange(),
                               cl::NDRange(length), cl::NDRange(64));

10. Read back the results from bufY to the host pointer pY. We make this a blocking call (using the CL_TRUE argument) because we do not want to proceed before the kernel has finished execution and the results are available on the host.

    queue.enqueueReadBuffer(bufY, CL_TRUE, 0, length * sizeof(cl_float),
                            pY);

11. Clean up the host resources (pX and pY). OpenCL resources are cleaned up by the C++ bindings support code.

The catch(cl::Error err) block handles exceptions thrown by the C++ bindings code. If there is an OpenCL call error, it prints out the name of the call and the error code (codes are defined in CL/cl.h). If there is a kernel compilation error, the error code is CL_BUILD_PROGRAM_FAILURE, in which case it is necessary to print out the build log.

1.9 Example Programs
Copyright © 2010 Advanced Micro Devices, Inc. All rights reserved.
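Two host-side details in steps 9 and 10 can be sketched in plain C++, without any OpenCL calls. First, in OpenCL 1.x the global work size passed to the enqueue call must be a multiple of the local work size (64 in step 9), so a length that is not a multiple of 64 needs rounding up. Second, the values read back into pY can be compared against a CPU reference. The sketch below is not part of the original sample: the helper names round_up_global_size, saxpy_reference, and nearly_equal are hypothetical, and it assumes the kernel computes the standard SAXPY operation y[i] = a * x[i] + y[i]. A rounded-up global size also assumes the kernel ignores work-items whose index is at or beyond length.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: smallest multiple of `local` that is >= `length`.
// Assumes local > 0. Use the result as the global work size when `length`
// is not already a multiple of the local work size.
std::size_t round_up_global_size(std::size_t length, std::size_t local) {
    return ((length + local - 1) / local) * local;
}

// Hypothetical helper: CPU reference for SAXPY, out[i] = a * x[i] + y[i].
// Assumes the device kernel implements the same standard definition.
std::vector<float> saxpy_reference(float a,
                                   const std::vector<float>& x,
                                   const std::vector<float>& y) {
    assert(x.size() == y.size());
    std::vector<float> out(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        out[i] = a * x[i] + y[i];
    }
    return out;
}

// Hypothetical helper: element-wise comparison with a relative tolerance,
// since device and host floating-point results may differ slightly.
bool nearly_equal(const std::vector<float>& got,
                  const std::vector<float>& want,
                  float tol = 1e-5f) {
    if (got.size() != want.size()) {
        return false;
    }
    for (std::size_t i = 0; i < got.size(); ++i) {
        float scale = std::max(1.0f, std::fabs(want[i]));
        if (std::fabs(got[i] - want[i]) > tol * scale) {
            return false;
        }
    }
    return true;
}
```

In the sample, one possible use would be to keep a copy of the initial contents of pY before step 9, compute saxpy_reference on the host, and pass the data read back in step 10 to nearly_equal.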
