03.03.2013 Views

Intel® Architecture Instruction Set Extensions Programming Reference


INSTRUCTION SET REFERENCE

VGATHERDPS/VGATHERQPS - Gather Packed SP FP values Using Signed Dword/Qword Indices

| Opcode | Instruction | Op/En | 64/32-bit Mode | CPUID Feature Flag | Description |
|---|---|---|---|---|---|
| VEX.DDS.128.66.0F38.W0 92 /r | VGATHERDPS xmm1, vm32x, xmm2 | A | V/V | AVX2 | Using dword indices specified in vm32x, gather single-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
| VEX.DDS.128.66.0F38.W0 93 /r | VGATHERQPS xmm1, vm64x, xmm2 | A | V/V | AVX2 | Using qword indices specified in vm64x, gather single-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |
| VEX.DDS.256.66.0F38.W0 92 /r | VGATHERDPS ymm1, vm32y, ymm2 | A | V/V | AVX2 | Using dword indices specified in vm32y, gather single-precision FP values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged into ymm1. |
| VEX.DDS.256.66.0F38.W0 93 /r | VGATHERQPS xmm1, vm64y, xmm2 | A | V/V | AVX2 | Using qword indices specified in vm64y, gather single-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged into xmm1. |

Instruction Operand Encoding

| Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
|---|---|---|---|---|
| A | ModRM:reg (r, w) | BaseReg (R): VSIB:base, VectorReg (R): VSIB:index | VEX.vvvv (r, w) | NA |

The instruction conditionally loads up to 4 or 8 single-precision floating-point values from memory addresses specified by the memory operand (the second operand), using dword indices. The memory operand uses the VSIB form of the SIB byte to specify a general-purpose register operand as the common base, a vector register for an array of indices relative to the base, and a constant scale factor.
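The address computation described above can be sketched as a small scalar model. This is an illustration only, not the hardware definition: the helper names `sign_extend` and `vsib_addresses` are hypothetical, the restriction of the scale to 1, 2, 4 or 8 follows the usual SIB encoding, and any displacement in the memory operand is left out for brevity.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def vsib_addresses(base, indices, scale, index_bits=32, addr_bits=64):
    """Return one effective address per index: base + signed(index) * scale.

    base       : value of the general-purpose base register
    indices    : raw index elements taken from the vector index register
    scale      : constant scale factor (assumed to be 1, 2, 4 or 8)
    index_bits : 32 for dword indices, 64 for qword indices
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    wrap = (1 << addr_bits) - 1  # addresses wrap at the address width
    return [(base + sign_extend(i, index_bits) * scale) & wrap for i in indices]


# Four dword indices; 0xFFFFFFFF is treated as the signed value -1.
addresses = vsib_addresses(0x1000, [0, 3, 0xFFFFFFFF, 2], 4)
print([hex(a) for a in addresses])
```

The third index shows why the indices are described as signed: it addresses the element just below the base rather than one far above it.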

The mask operand (the third operand) specifies the conditional load operation from each memory address and the corresponding update of each data element of the destination operand (the first operand). Conditionality is specified by the most significant bit of each data element of the mask register. If an element’s mask bit is not set, the corresponding element of the destination register is left unchanged. The data elements in the destination register and the mask register have the same width. The entire mask register is set to zero by this instruction unless the instruction causes an exception.
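A minimal sketch of the merge-under-mask behavior, assuming a gather that completes without an exception. The function name `gather_dps`, the use of a `bytes` object as a stand-in for memory, and the list representation of registers are all assumptions made for illustration.

```python
import struct


def gather_dps(dest, mask, memory, base, indices, scale):
    """Model a dword-index gather that runs to completion.

    dest    : list of floats, the current destination elements
    mask    : list of 32-bit ints; only bit 31 of each element is consulted
    memory  : bytes-like object standing in for the address space
    indices : raw 32-bit index elements
    Returns (new_dest, new_mask).
    """
    out = list(dest)
    for i, idx in enumerate(indices):
        if mask[i] & 0x80000000:  # most significant bit selects the load
            signed = idx - (1 << 32) if idx & 0x80000000 else idx
            address = base + signed * scale
            out[i] = struct.unpack_from("<f", memory, address)[0]
        # otherwise out[i] keeps the old destination value (merge)
    return out, [0] * len(mask)  # whole mask is zero after completion


memory = struct.pack("<8f", *[float(i) for i in range(8)])  # 0.0 .. 7.0
dest = [-1.0, -1.0, -1.0, -1.0]
mask = [0x80000000, 0x00000000, 0xFFFFFFFF, 0x7FFFFFFF]
new_dest, new_mask = gather_dps(dest, mask, memory, 0, [7, 0, 3, 5], 4)
print(new_dest, new_mask)
```

Only elements 0 and 2 have bit 31 set in their mask element, so only those two are loaded; the last mask element has every bit except bit 31 set and is therefore ignored.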

Using qword indices, the instruction conditionally loads up to 2 or 4 single-precision floating-point values from the VSIB addressing memory operand and updates the lower half of the destination register. With qword indices, the upper 64 bits (VEX.128) or upper 128 bits (VEX.256) of the destination register are zeroed.
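The qword-index case can be modeled the same way. The sketch below is an assumption-laden illustration: `gather_qps` is a hypothetical name, the destination and mask are represented as lists with one entry per 32-bit slot, and the gather is assumed to complete without an exception.

```python
import struct


def gather_qps(dest, mask, memory, base, qword_indices, scale):
    """Model a qword-index gather of single-precision values.

    dest and mask hold one entry per 32-bit slot; there is one qword index
    per gathered element, i.e. half as many indices as slots.
    Returns (new_dest, new_mask).
    """
    n = len(qword_indices)
    if len(dest) != 2 * n or len(mask) != 2 * n:
        raise ValueError("expected twice as many 32-bit slots as qword indices")
    low = list(dest[:n])  # only the lower half can receive data
    for i, idx in enumerate(qword_indices):
        if mask[i] & 0x80000000:
            signed = idx - (1 << 64) if idx & (1 << 63) else idx
            low[i] = struct.unpack_from("<f", memory, base + signed * scale)[0]
    # Upper half of the destination is zeroed; the mask is cleared entirely.
    return low + [0.0] * n, [0] * (2 * n)


memory = struct.pack("<8f", *[float(i) for i in range(8)])  # 0.0 .. 7.0
dest = [9.0, 9.0, 9.0, 9.0]
mask = [0x80000000, 0x00000000, 0x80000000, 0x80000000]
print(gather_qps(dest, mask, memory, 0, [5, 2], 4))
```

Mask elements in the upper half have no matching index, so they never cause a load; they are simply cleared along with the rest of the mask.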

This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask operand are partially updated; those elements that have been gathered are placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAGS.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.
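The partial-update and restart behavior can be illustrated with a scalar model. Everything here is a sketch under stated assumptions: `Fault`, `gather_step` and `demo` are hypothetical names, elements are processed from the rightmost (lowest) element upward as a simplification, an out-of-range address stands in for a memory fault, and EFLAGS.RF and pending traps or interrupts are not modeled.

```python
import struct


class Fault(Exception):
    """Stand-in for a memory fault raised while gathering one element."""


def gather_step(dest, mask, memory, base, indices, scale):
    """Gather in place, lowest element first; may raise Fault part-way."""
    for i, idx in enumerate(indices):
        if not mask[i] & 0x80000000:
            continue  # already gathered, or never selected
        address = base + idx * scale
        if address < 0 or address + 4 > len(memory):
            raise Fault(i)  # dest and mask keep their partial state
        dest[i] = struct.unpack_from("<f", memory, address)[0]
        mask[i] = 0  # this element is done: clear its mask element
    mask[:] = [0] * len(mask)  # completed: the whole mask is zero


def demo():
    memory = bytearray(struct.pack("<4f", 10.0, 11.0, 12.0, 13.0))
    dest, mask = [0.0] * 4, [0x80000000] * 4
    indices = [1, 0, 6, 3]  # index 6 is out of range on the first attempt
    partial = None
    try:
        gather_step(dest, mask, memory, 0, indices, 4)
    except Fault:
        partial = (list(dest), list(mask))
        # Hypothetical handler: make the missing range available.
        memory.extend(struct.pack("<4f", 14.0, 15.0, 16.0, 17.0))
        gather_step(dest, mask, memory, 0, indices, 4)  # re-execute
    return partial, (dest, mask)


print(demo())
```

Because completed elements have their mask bits cleared, re-executing the same operation after the fault is handled only gathers the elements that are still outstanding.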

If the data size and index size are different, part of the destination register and part of the mask register do not correspond to any elements being gathered. This instruction sets those parts to zero. It may do this to one or both of those registers even if the instruction triggers an exception, and even if the instruction triggers the exception before gathering any elements.

VEX.128 version: For dword indices, the instruction gathers four single-precision floating-point values. For qword indices, the instruction gathers two values and zeroes the upper 64 bits of the destination.

VEX.256 version: For dword indices, the instruction gathers eight single-precision floating-point values. For qword indices, the instruction gathers four values and zeroes the upper 128 bits of the destination.
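The element counts for the two vector lengths follow from simple arithmetic, sketched below. The function name `gather_ps_layout` is hypothetical, and `vector_bits` is taken to mean the VEX vector length (128 or 256).

```python
def gather_ps_layout(vector_bits, index_bits):
    """Return (elements gathered, upper destination bits zeroed).

    vector_bits : 128 for the VEX.128 version, 256 for the VEX.256 version
    index_bits  : 32 for dword indices, 64 for qword indices
    """
    if vector_bits not in (128, 256) or index_bits not in (32, 64):
        raise ValueError("unsupported vector length or index size")
    data_bits = 32  # single-precision element width
    elements = vector_bits // index_bits  # one element per index
    zeroed_upper_bits = vector_bits - elements * data_bits
    return elements, zeroed_upper_bits


for vl in (128, 256):
    for ib in (32, 64):
        print(vl, ib, gather_ps_layout(vl, ib))
```

With dword indices the data and index widths match, so nothing is zeroed; with qword indices only half as many elements fit, and the remainder of the destination is cleared.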

5-208 Ref. # 319433-014
