03.03.2013 Views

Intel® Architecture Instruction Set Extensions Programming Reference


INSTRUCTION SET REFERENCE

PSHUFLW - Shuffle Packed Low Words

Description

Copies words from the low quadword of a 128-bit lane of the source operand and inserts them in the low quadword of the destination operand at word locations (of the respective lane) selected with the immediate operand. The 256-bit operation is similar to the in-lane operation used by the 256-bit VPSHUFD instruction, which is illustrated in Figure 5-4. For 128-bit operation, only the low 128-bit lane is operative. Each 2-bit field in the immediate operand selects the contents of one word location in the low quadword of the destination operand. The binary encodings of the immediate operand fields select words (0, 1, 2 or 3) from the low quadword of the source operand to be copied to the destination operand. The high quadword of the source operand is copied to the high quadword of the destination operand, for each 128-bit lane.
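The selection rule described above can be sketched in a few lines of Python. This is only an illustrative model of one 128-bit lane, not Intel's reference code; the function name `pshuflw128` and the list-of-words representation (index 0 holds bits 15:0) are assumptions made for the example.

```python
def pshuflw128(src, imm8):
    """Model PSHUFLW on one 128-bit lane given as 8 words (src[0] = bits 15:0)."""
    if len(src) != 8:
        raise ValueError("expected exactly 8 words")
    # Destination word i takes the source word named by imm8 bits [2i+1:2i].
    low = [src[(imm8 >> (2 * i)) & 0b11] for i in range(4)]
    # The high quadword (words 4..7) is copied through unchanged.
    return low + list(src[4:])


src = [0xA0, 0xA1, 0xA2, 0xA3, 0xB4, 0xB5, 0xB6, 0xB7]
reversed_low = pshuflw128(src, 0b00_01_10_11)  # selectors 3, 2, 1, 0 for words 0..3
identity = pshuflw128(src, 0b11_10_01_00)      # selectors 0, 1, 2, 3 leave src as is
```

With the immediate 0x1B the four low words come out in reverse order, while 0xE4 reproduces the source unchanged.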

Note that this instruction permits a word in the low quadword of the source operand to be copied to more than one word location in the low quadword of the destination operand.
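Because each destination word has its own 2-bit selector, repeating a selector value duplicates a source word. The sketch below illustrates this; `shuffle_imm` and `low_words` are hypothetical helper names, and the argument order of `shuffle_imm` is chosen to resemble the `_MM_SHUFFLE` macro found in common compiler intrinsics headers.

```python
def shuffle_imm(w3, w2, w1, w0):
    """Pack four 2-bit selectors into an imm8; w0 is the selector for destination word 0."""
    for sel in (w3, w2, w1, w0):
        if not 0 <= sel <= 3:
            raise ValueError("each selector must be in 0..3")
    return (w3 << 6) | (w2 << 4) | (w1 << 2) | w0


def low_words(src_low, imm8):
    """Apply the imm8 selectors to the four words of a low quadword."""
    return [src_low[(imm8 >> (2 * i)) & 0b11] for i in range(4)]


# Every selector names source word 2, so that word is copied to all four slots.
broadcast_imm = shuffle_imm(2, 2, 2, 2)
broadcast = low_words([10, 11, 12, 13], broadcast_imm)
```

Here `broadcast_imm` is 0xAA and every low destination word receives source word 2.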

Legacy SSE instructions: In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

128-bit Legacy SSE version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (255:128) of the corresponding YMM destination register remain unchanged.

VEX.128 encoded version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (255:128) of the corresponding YMM register are zeroed.

VEX.256 encoded version: The destination operand is a YMM register. The source operand can be a YMM register or a 256-bit memory location.
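The difference between the legacy SSE and VEX.128 forms lies only in what happens to bits 255:128 of the destination YMM register. The following sketch models registers as Python integers to make that visible; all function names are hypothetical and the model ignores everything except the data movement.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def shuffle_lane(lane, imm8):
    """Shuffle the four low words of a 128-bit lane; keep its high quadword."""
    out = lane & (MASK128 ^ MASK64)  # start from the untouched high quadword
    for i in range(4):
        sel = (imm8 >> (2 * i)) & 0b11
        out |= ((lane >> (16 * sel)) & 0xFFFF) << (16 * i)
    return out


def pshuflw_legacy(dest_ymm, src_xmm, imm8):
    """Legacy SSE form: bits 255:128 of the destination stay as they were."""
    upper = dest_ymm & ~MASK128
    return upper | shuffle_lane(src_xmm & MASK128, imm8)


def vpshuflw_vex128(src_xmm, imm8):
    """VEX.128 form: bits 255:128 of the destination are zeroed."""
    return shuffle_lane(src_xmm & MASK128, imm8)


old_dest = (0xDEAD << 240) | 0x1234
src = (0x8888777766665555 << 64) | 0x0004000300020001
legacy = pshuflw_legacy(old_dest, src, 0x1B)
vex = vpshuflw_vex128(src, 0x1B)
```

Both results agree in bits 127:0; only `legacy` still carries the old upper half of the destination.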

Operation

VPSHUFLW (VEX.256 encoded version)
DEST[15:0] ← (SRC1 >> (imm[1:0] * 16))[15:0]
DEST[31:16] ← (SRC1 >> (imm[3:2] * 16))[15:0]
DEST[47:32] ← (SRC1 >> (imm[5:4] * 16))[15:0]
DEST[63:48] ← (SRC1 >> (imm[7:6] * 16))[15:0]
DEST[127:64] ← SRC1[127:64]
DEST[143:128] ← (SRC1 >> (imm[1:0] * 16))[143:128]
DEST[159:144] ← (SRC1 >> (imm[3:2] * 16))[143:128]
DEST[175:160] ← (SRC1 >> (imm[5:4] * 16))[143:128]
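As a rough cross-check of the bit-slice notation, the pseudocode can be transliterated into Python with the 256-bit operands held as integers. The listing above stops at DEST[175:160]; the handling of DEST[191:176] and DEST[255:192] in this sketch is an assumption that follows the prose description (the same rule applied to each 128-bit lane). The helper names `bits`, `to_words`, and `from_words` are invented for the example.

```python
def bits(value, hi, lo):
    """Return value[hi:lo] as an integer, mirroring the manual's slice notation."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def vpshuflw256(src1, imm8):
    dest = 0
    for base in (0, 128):  # low lane, then high lane
        for i in range(4):
            sel = bits(imm8, 2 * i + 1, 2 * i)
            # (SRC1 >> (sel * 16))[base+15:base] picks word `sel` of this lane.
            word = bits(src1 >> (sel * 16), base + 15, base)
            dest |= word << (base + 16 * i)
        # The high quadword of each lane is copied unchanged.
        dest |= bits(src1, base + 127, base + 64) << (base + 64)
    return dest


def from_words(words):
    return sum(w << (16 * i) for i, w in enumerate(words))


def to_words(value, count=16):
    return [(value >> (16 * i)) & 0xFFFF for i in range(count)]


result = to_words(vpshuflw256(from_words(list(range(16))), 0x1B))
```

With source words 0 through 15 and the immediate 0x1B, words 0..3 and 8..11 are each reversed within their lane, and words 4..7 and 12..15 pass through.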

| Opcode/Instruction | Op/En | 64/32-bit Mode | CPUID Feature Flag | Description |
|---|---|---|---|---|
| F2 0F 70 /r ib PSHUFLW xmm1, xmm2/m128, imm8 | A | V/V | SSE2 | Shuffle the low words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
| VEX.128.F2.0F.WIG 70 /r ib VPSHUFLW xmm1, xmm2/m128, imm8 | A | V/V | AVX | Shuffle the low words in xmm2/m128 based on the encoding in imm8 and store the result in xmm1. |
| VEX.256.F2.0F.WIG 70 /r ib VPSHUFLW ymm1, ymm2/m256, imm8 | A | V/V | AVX2 | Shuffle the low words in ymm2/m256 based on the encoding in imm8 and store the result in ymm1. |

Instruction Operand Encoding

| Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
|---|---|---|---|---|
| A | ModRM:reg (w) | ModRM:r/m (r) | NA | NA |
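To show how the legacy opcode `F2 0F 70 /r ib` and the operand encoding fit together, here is a minimal sketch of an encoder for the register-to-register form only. The function name is hypothetical. The use of REX.B to extend the ModRM:r/m field goes beyond the text above, which mentions only REX.R, and is stated here as general x86-64 encoding background; the sketch assumes 64-bit mode whenever a register number above 7 is used.

```python
def encode_pshuflw_rr(dst, src, imm8):
    """Encode legacy-SSE PSHUFLW xmm<dst>, xmm<src>, imm8 (register operands only)."""
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register numbers must be in 0..15")
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in one byte")
    # REX = 0100WRXB: R extends ModRM:reg (destination), B extends ModRM:r/m (source).
    rex = 0x40 | ((dst >> 3) << 2) | (src >> 3)
    # mod = 11b selects a register operand in the r/m field.
    modrm = 0xC0 | ((dst & 7) << 3) | (src & 7)
    prefixes = [0xF2] + ([rex] if rex != 0x40 else [])
    return bytes(prefixes + [0x0F, 0x70, modrm, imm8])


plain = encode_pshuflw_rr(0, 1, 0x1B)      # pshuflw xmm0, xmm1, 0x1B
extended = encode_pshuflw_rr(8, 1, 0x1B)   # pshuflw xmm8, xmm1, 0x1B
```

The REX byte is emitted only when one of the registers is XMM8-XMM15, and it sits between the mandatory F2 prefix and the 0F escape byte.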

5-126 Ref. # 319433-014
