03.03.2013 Views

Intel® Architecture Instruction Set Extensions Programming Reference

INSTRUCTION SET REFERENCE

PSHUFB — Packed Shuffle Bytes

Description

PSHUFB performs in-place shuffles of bytes in the first source operand according to the shuffle control mask in the second source operand. The instruction permutes the data in the first source operand, leaving the shuffle mask unaffected. Each byte of the shuffle control mask governs the result byte at the same position: if the most significant bit (bit[7]) of the control byte is set, constant zero is written to the result byte; otherwise, the least significant 4 bits of the control byte form an index that selects which byte of the first source operand is copied to that position. The first source and destination operands are XMM registers. The second source is either an XMM register or a 128-bit memory location.
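The selection rule described above can be sketched as a scalar C model. This is only an illustration of the byte-level semantics, not how the instruction is implemented; the names `pshufb128_model` and `pshufb128_reverse_example` are hypothetical and do not correspond to any Intel intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the 128-bit shuffle. dst, src and ctrl are 16-byte arrays.
 * Hypothetical helper for illustration; not an Intel-provided function. */
void pshufb128_model(uint8_t dst[16], const uint8_t src[16], const uint8_t ctrl[16])
{
    uint8_t tmp[16];                       /* lets dst alias src (in-place form) */
    for (int i = 0; i < 16; i++) {
        if (ctrl[i] & 0x80)
            tmp[i] = 0;                    /* bit 7 set: result byte is zero */
        else
            tmp[i] = src[ctrl[i] & 0x0F];  /* low 4 bits select the source byte */
    }
    for (int i = 0; i < 16; i++)
        dst[i] = tmp[i];
}

/* Usage sketch: a control mask of 15, 14, ..., 0 reverses the byte order;
 * overriding one control byte with 0x80 zeroes that result byte. */
void pshufb128_reverse_example(void)
{
    uint8_t src[16], ctrl[16], dst[16];
    for (int i = 0; i < 16; i++) {
        src[i] = (uint8_t)(0xA0 + i);
        ctrl[i] = (uint8_t)(15 - i);
    }
    ctrl[0] = 0x80;                        /* force result byte 0 to zero */
    pshufb128_model(dst, src, ctrl);
    assert(dst[0] == 0x00);
    assert(dst[1] == 0xAE);                /* copied from src[14] */
    assert(dst[15] == 0xA0);               /* copied from src[0] */
}
```

Note that bits 6:4 of a control byte are ignored in this model, matching the statement that only the least significant 4 bits form the index.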

128-bit Legacy SSE version: The first source and destination operands are the same. Bits (255:128) of the corresponding YMM destination register remain unchanged.

VEX.128 encoded version: Bits (255:128) of the destination YMM register are zeroed.

VEX.256 encoded version: Bits (255:128) of the destination YMM register store the 16-byte shuffle result of the upper 16 bytes of the first source operand, using the upper 16 bytes of the second source operand as the control mask. The value of each index for the high 128-bit lane is the least significant 4 bits of the respective shuffle control byte. The index value selects a source data element within its own 128-bit lane; it cannot select a byte from the other lane.
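The difference between the three encodings lies mainly in what happens to bits (255:128) of the destination. A minimal sketch, assuming a YMM register is modeled as a 32-byte array with byte 0 as the least significant byte; all function names here are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One 16-byte lane of the shuffle; out must not alias src. */
static void shuffle_lane(uint8_t *out, const uint8_t *src, const uint8_t *ctrl)
{
    for (int i = 0; i < 16; i++)
        out[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 0x0F];
}

/* Legacy SSE: destination is also the first source; bits 255:128 unchanged. */
void model_legacy_sse(uint8_t ymm_dst[32], const uint8_t ctrl[16])
{
    uint8_t lo[16];
    shuffle_lane(lo, ymm_dst, ctrl);
    memcpy(ymm_dst, lo, 16);
}

/* VEX.128: separate first source; bits 255:128 of the destination are zeroed. */
void model_vex128(uint8_t ymm_dst[32], const uint8_t src1[16], const uint8_t ctrl[16])
{
    uint8_t lo[16];
    shuffle_lane(lo, src1, ctrl);
    memcpy(ymm_dst, lo, 16);
    memset(ymm_dst + 16, 0, 16);
}

/* VEX.256: each 128-bit lane is shuffled with its own half of the mask. */
void model_vex256(uint8_t ymm_dst[32], const uint8_t src1[32], const uint8_t ctrl[32])
{
    uint8_t out[32];
    shuffle_lane(out, src1, ctrl);
    shuffle_lane(out + 16, src1 + 16, ctrl + 16);
    memcpy(ymm_dst, out, 32);
}

/* Usage sketch: an all-zero control mask broadcasts byte 0 of each lane. */
void upper_half_example(void)
{
    uint8_t src[32], ctrl[32], dst[32];
    for (int i = 0; i < 32; i++) {
        src[i] = (uint8_t)(i + 1);
        ctrl[i] = 0;
    }
    memset(dst, 0xFF, sizeof dst);
    model_vex128(dst, src, ctrl);
    assert(dst[0] == 1 && dst[16] == 0);   /* upper half zeroed */
    model_vex256(dst, src, ctrl);
    assert(dst[0] == 1 && dst[16] == 17);  /* high lane reads its own byte 0 */
}
```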

Opcode/Instruction | Op/En | 64/32-bit Mode | CPUID Feature Flag | Description
--- | --- | --- | --- | ---
66 0F 38 00 /r PSHUFB xmm1, xmm2/m128 | A | V/V | SSSE3 | Shuffle bytes in xmm1 according to contents of xmm2/m128.
VEX.NDS.128.66.0F38.WIG 00 /r VPSHUFB xmm1, xmm2, xmm3/m128 | B | V/V | AVX | Shuffle bytes in xmm2 according to contents of xmm3/m128.
VEX.NDS.256.66.0F38.WIG 00 /r VPSHUFB ymm1, ymm2, ymm3/m256 | B | V/V | AVX2 | Shuffle bytes in ymm2 according to contents of ymm3/m256.

Instruction Operand Encoding

Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
--- | --- | --- | --- | ---
A | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA
B | ModRM:reg (w) | VEX.vvvv | ModRM:r/m (r) | NA

Operation

VPSHUFB (VEX.256 encoded version)

for i = 0 to 15 {
    if (SRC2[(i*8)+7] == 1) then
        DEST[(i*8)+7..(i*8)+0] ← 0;
    else
        index[3..0] ← SRC2[(i*8)+3..(i*8)+0];
        DEST[(i*8)+7..(i*8)+0] ← SRC1[(index*8+7)..(index*8+0)];
    endif
    if (SRC2[128+(i*8)+7] == 1) then
        DEST[128+(i*8)+7..128+(i*8)+0] ← 0;
    else
        index[3..0] ← SRC2[128+(i*8)+3..128+(i*8)+0];
        DEST[128+(i*8)+7..128+(i*8)+0] ← SRC1[128+(index*8+7)..128+(index*8+0)];
    endif
}
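As a rough cross-check of the loop above, the following C sketch wraps the AVX2 intrinsic `_mm256_shuffle_epi8` (declared in `<immintrin.h>`) when the compiler targets AVX2, and otherwise falls back to a scalar transcription of the pseudocode. The wrapper name `vpshufb256` and the example function are hypothetical; the byte-array view of the operands is an assumption made for readability.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__AVX2__)
#include <immintrin.h>   /* _mm256_shuffle_epi8 (AVX2) */
#endif

/* Shuffle 32 bytes following the VEX.256 operation. Hypothetical wrapper:
 * uses the AVX2 intrinsic if available, else a scalar version of the loop. */
void vpshufb256(uint8_t dst[32], const uint8_t src1[32], const uint8_t src2[32])
{
#if defined(__AVX2__)
    __m256i a = _mm256_loadu_si256((const __m256i *)src1);
    __m256i m = _mm256_loadu_si256((const __m256i *)src2);
    _mm256_storeu_si256((__m256i *)dst, _mm256_shuffle_epi8(a, m));
#else
    uint8_t out[32];
    for (int i = 0; i < 16; i++) {
        /* low lane: bytes 0..15 */
        out[i] = (src2[i] & 0x80) ? 0 : src1[src2[i] & 0x0F];
        /* high lane: same rule, every offset shifted by 16 bytes (128 bits) */
        out[16 + i] = (src2[16 + i] & 0x80) ? 0
                                            : src1[16 + (src2[16 + i] & 0x0F)];
    }
    memcpy(dst, out, 32);
#endif
}

/* Usage sketch: the mask 15..0 repeated in both lanes reverses the bytes
 * within each 128-bit lane, not across the full 256 bits. */
void vpshufb256_reverse_example(void)
{
    uint8_t src[32], ctrl[32], dst[32];
    for (int i = 0; i < 32; i++) {
        src[i] = (uint8_t)i;
        ctrl[i] = (uint8_t)(15 - (i % 16));
    }
    vpshufb256(dst, src, ctrl);
    assert(dst[0] == 15 && dst[15] == 0);    /* low lane reversed */
    assert(dst[16] == 31 && dst[31] == 16);  /* high lane reversed separately */
}
```

Both code paths should agree byte for byte, which makes the scalar branch usable as a reference when testing control masks on machines without AVX2.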

5-120 Ref. # 319433-014
