3.25 Memory Management

3.25.2.7 CUresult cuMemAllocHost (void ** pp, unsigned int bytesize)

Allocates bytesize bytes of host memory that is page-locked and accessible to the device. The driver tracks the virtual memory ranges allocated with this function and automatically accelerates calls to functions such as cuMemcpy(). Since the memory can be accessed directly by the device, it can be read or written with much higher bandwidth than pageable memory obtained with functions such as malloc(). Allocating excessive amounts of memory with cuMemAllocHost() may degrade system performance, since it reduces the amount of memory available to the system for paging. As a result, this function is best used sparingly to allocate staging areas for data exchange between host and device.

Parameters:
    pp - Returned host pointer to page-locked memory
    bytesize - Requested allocation size in bytes

Returns:
    CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY

Note:
    Note that this function may also return error codes from previous, asynchronous launches.

See also:
    cuArray3DCreate, cuArray3DGetDescriptor, cuArrayCreate, cuArrayDestroy, cuArrayGetDescriptor, cuMemAlloc, cuMemAllocPitch, cuMemcpy2D, cuMemcpy2DAsync, cuMemcpy2DUnaligned, cuMemcpy3D, cuMemcpy3DAsync, cuMemcpyAtoA, cuMemcpyAtoD, cuMemcpyAtoH, cuMemcpyAtoHAsync, cuMemcpyDtoA, cuMemcpyDtoD, cuMemcpyDtoH, cuMemcpyDtoHAsync, cuMemcpyHtoA, cuMemcpyHtoAAsync, cuMemcpyHtoD, cuMemcpyHtoDAsync, cuMemFree, cuMemFreeHost, cuMemGetAddressRange, cuMemGetInfo, cuMemHostAlloc, cuMemHostGetDevicePointer, cuMemsetD2D8, cuMemsetD2D16, cuMemsetD2D32, cuMemsetD8, cuMemsetD16, cuMemsetD32
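As an illustration (not part of the original reference), the following is a minimal sketch of using cuMemAllocHost() to allocate a page-locked staging area for a host-to-device transfer. It assumes a CUDA context is already current on the calling thread; the function name and buffer sizes are illustrative only, and error handling is abbreviated.

#include <cuda.h>
#include <string.h>

/* Copies `bytes` bytes of host data to the device through a page-locked
   staging buffer. Assumes a current CUDA context. */
CUresult copy_through_staging(CUdeviceptr dst, const void *src, unsigned int bytes)
{
    void *staging = 0;

    /* Page-locked host memory transfers at higher bandwidth than malloc()ed memory. */
    CUresult status = cuMemAllocHost(&staging, bytes);
    if (status != CUDA_SUCCESS)
        return status;

    memcpy(staging, src, bytes);                 /* fill the staging area on the host */
    status = cuMemcpyHtoD(dst, staging, bytes);  /* device reads the page-locked memory directly */

    cuMemFreeHost(staging);                      /* release pinned memory promptly */
    return status;
}

Freeing the staging buffer as soon as the transfer completes keeps the amount of page-locked memory small, in line with the guidance above about system paging.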
3.25.2.8 CUresult cuMemAllocPitch (CUdeviceptr * dptr, unsigned int * pPitch, unsigned int WidthInBytes, unsigned int Height, unsigned int ElementSizeBytes)

Allocates at least WidthInBytes * Height bytes of linear memory on the device and returns in *dptr a pointer to the allocated memory. The function may pad the allocation to ensure that corresponding pointers in any given row will continue to meet the alignment requirements for coalescing as the address is updated from row to row. ElementSizeBytes specifies the size of the largest reads and writes that will be performed on the memory range. ElementSizeBytes may be 4, 8 or 16 (since coalesced memory transactions are not possible on other data sizes). If ElementSizeBytes is smaller than the actual read/write size of a kernel, the kernel will run correctly, but possibly at reduced speed. The pitch returned in *pPitch by cuMemAllocPitch() is the width in bytes of the allocation. The intended usage of pitch is as a separate parameter of the allocation, used to compute addresses within the 2D array. Given the row and column of an array element of type T, the address is computed as:

T* pElement = (T*)((char*)BaseAddress + Row * Pitch) + Column;

The pitch returned by cuMemAllocPitch() is guaranteed to work with cuMemcpy2D() under all circumstances. For allocations of 2D arrays, it is recommended that programmers consider performing pitch allocations using cuMemAllocPitch(). Due to alignment restrictions in the hardware, this is especially true if the application will be performing 2D memory copies between different regions of device memory (whether linear memory or CUDA arrays).
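As an illustration (not part of the original reference), the following sketch allocates a pitched 2D array of floats with cuMemAllocPitch() and uploads a densely packed host array into it with cuMemcpy2D(), which the text above guarantees accepts the returned pitch. It assumes a current CUDA context; the function name and dimensions are illustrative, and error handling is abbreviated.

#include <cuda.h>
#include <string.h>

/* Allocates a height x width array of floats on the device and copies a
   densely packed host array into it, honoring the returned pitch. */
CUresult upload_pitched(CUdeviceptr *dptr, unsigned int *pitch,
                        const float *host, unsigned int width, unsigned int height)
{
    /* ElementSizeBytes = sizeof(float) because the largest access is a 4-byte float. */
    CUresult status = cuMemAllocPitch(dptr, pitch,
                                      width * sizeof(float), height, sizeof(float));
    if (status != CUDA_SUCCESS)
        return status;

    CUDA_MEMCPY2D copy;
    memset(&copy, 0, sizeof(copy));
    copy.srcMemoryType = CU_MEMORYTYPE_HOST;
    copy.srcHost       = host;
    copy.srcPitch      = width * sizeof(float);   /* host rows are densely packed */
    copy.dstMemoryType = CU_MEMORYTYPE_DEVICE;
    copy.dstDevice     = *dptr;
    copy.dstPitch      = *pitch;                  /* pitch returned by cuMemAllocPitch() */
    copy.WidthInBytes  = width * sizeof(float);
    copy.Height        = height;
    return cuMemcpy2D(&copy);
}

/* Inside a kernel, an element at (Row, Column) of this allocation would be
   addressed with the formula given above:
       float *pElement = (float *)((char *)BaseAddress + Row * Pitch) + Column;  */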
