12.07.2015 Views

PGI User's Guide

Availability

The PGI 11.1 Fortran & C Accelerator compilers are available only on x86 processor-based workstations and servers with an attached NVIDIA CUDA-enabled GPU or Tesla card. These compilers target all platforms that PGI supports except 64-bit Mac OS X. All examples included in this chapter are developed and presented on such a platform. For a list of supported GPUs, refer to the Accelerator Installation and Supported Platforms list in the latest PGI Release Notes.

User-directed Accelerator Programming

In user-directed accelerator programming, the user specifies the regions of a host program to be targeted for offloading to an accelerator device. The bulk of a user’s program, as well as regions containing constructs that are not supported on the targeted accelerator, are executed on the host. This chapter concentrates on specification of loops and regions of code to be offloaded to an accelerator.

Features Not Covered or Implemented

This chapter does not describe features or limitations of the host programming environment as a whole. Further, it does not cover automatic detection and offloading of regions of code to an accelerator by a compiler or other tool. While future versions of the PGI compilers may allow for automatic offloading or multiple accelerators of different types, these features are not currently supported.

Terminology

Clear and consistent terminology is important in describing any programming model. This section provides definitions of the terms required for you to effectively use this chapter and the associated programming model.

Accelerator
a special-purpose co-processor attached to a CPU and to which the CPU can offload data and executable kernels to perform compute-intensive calculations.

Compute intensity
for a given loop, region, or program unit, the ratio of the number of arithmetic operations performed on computed data to the number of memory transfers required to move that data between two levels of a memory hierarchy.

Compute region
a region defined by an Accelerator compute region directive. A compute region is a structured block containing loops which are compiled for the accelerator. A compute region may require device memory to be allocated and data to be copied from host to device upon region entry, and data to be copied from device to host memory and device memory deallocated upon exit. Compute regions may not contain other compute regions or data regions.

CUDA
stands for Compute Unified Device Architecture; the CUDA environment from NVIDIA is a C-like programming environment used to explicitly control and program an NVIDIA GPU.

Data region
a region defined by an Accelerator data region directive, or an implicit data region for a function or subroutine containing Accelerator directives. Data regions typically require device memory to be allocated
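
To make the terms compute region and compute intensity more concrete, the following is a minimal C sketch rather than an example taken from this guide. The function names scale_add and compute_intensity are hypothetical and chosen only for illustration. The loop is marked with the PGI Accelerator compute region directive (#pragma acc region); whether a particular PGI release accepts it without explicit data clauses describing the array bounds is an assumption. A compiler that does not recognize the directive ignores the pragma and runs the loop on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example loop: r[i] = a[i] * 2.0f + b[i].
 * The structured block after the directive is the compute region.
 * Assumption: a real PGI build may also need data clauses so the
 * compiler knows how much of a, b, and r to copy to and from the device. */
void scale_add(int n, const float *restrict a, const float *restrict b,
               float *restrict r)
{
    int i;
#pragma acc region
    {
        for (i = 0; i < n; ++i)
            r[i] = a[i] * 2.0f + b[i];
    }
}

/* Compute intensity as defined in the terminology list:
 * arithmetic operations divided by memory transfers.
 * Hypothetical helper; the caller supplies both counts. */
double compute_intensity(double arithmetic_ops, double memory_transfers)
{
    assert(memory_transfers > 0.0);
    return arithmetic_ops / memory_transfers;
}
```

For this loop, each iteration performs two arithmetic operations (one multiply, one add) and moves three values between host and device memory (a[i] and b[i] in, r[i] out), so calling compute_intensity(2.0 * n, 3.0 * n) gives 2/3 for any n greater than zero. A ratio this low suggests that the cost of moving data may outweigh the arithmetic being offloaded.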
