03.03.2013 Views

Intel® Architecture Instruction Set Extensions Programming Reference

CHAPTER 3

SYSTEM PROGRAMMING MODEL

This chapter describes the operating system programming considerations for AVX, F16C, AVX2 and FMA. The AES extensions and the PCLMULQDQ instruction follow the same system software requirements for XMM state support and SIMD floating-point exception support as SSE2, SSE3, SSSE3 and SSE4 (see Chapter 12 of the IA-32 Intel Architecture Software Developer’s Manual, Volume 3A).

The AVX, F16C, AVX2 and FMA extensions operate on 256-bit YMM registers and require the operating system to support processor extended state management using the XSAVE/XRSTOR instructions. VAESDEC/VAESDECLAST/VAESENC/VAESENCLAST/VAESIMC/VAESKEYGENASSIST/VPCLMULQDQ follow the same system programming requirements as AVX and FMA instructions operating on YMM states.

The basic requirements for an operating system using XSAVE/XRSTOR to manage processor extended states for current and future Intel Architecture processors can be found in Chapter 12 of the IA-32 Intel Architecture Software Developer’s Manual, Volume 3A. This chapter covers the additional requirements for an OS to support YMM state.

3.1 YMM STATE, VEX PREFIX AND SUPPORTED OPERATING MODES<br />

AVX, F16C, AVX2 and FMA instructions operate on YMM states and require VEX prefix encoding. SIMD instructions operating on XMM states (i.e., not accessing the upper 128 bits of YMM) generally do not use the VEX prefix. Not all instructions that require VEX prefix encoding need YMM or XMM registers as operands.

For processors that support YMM states, the YMM state exists in all operating modes. However, the available interfaces to access YMM states may vary in different modes. The processor's support for instruction extensions that employ VEX prefix encoding is independent of the processor's support for YMM state.

Instructions requiring VEX prefix encoding are generally supported in 64-bit mode, 32-bit mode, and 16-bit protected mode. They are not supported in Real mode or Virtual-8086 mode, nor upon entering SMM.

Note that bits 255:128 of YMM register state are maintained across transitions into and out of these modes. Because the XSAVE/XRSTOR instructions can operate in all operating modes, software can modify the processor's YMM register state in any operating mode by executing XRSTOR. XRSTOR updates the YMM registers using the state information stored in the XSAVE/XRSTOR area residing in memory.

3.2 YMM STATE MANAGEMENT<br />

Operating systems must use the XSAVE/XRSTOR instructions for YMM state management. The XSAVE/XRSTOR instructions also provide a flexible and efficient interface to manage XMM/MXCSR states and x87 FPU states in conjunction with new processor extended states.

An OS must enable its YMM state management to support the AVX and FMA extensions. Otherwise, an attempt to execute an instruction in the AVX or FMA extensions (including an enhanced 128-bit SIMD instruction using VEX encoding) will cause a #UD exception.
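Because executing an AVX or FMA instruction without OS support raises #UD, software typically checks for OS enablement before using these instructions. The following is a minimal sketch of that check, not a definitive implementation. It relies on details this section does not state and that are assumed here: that CPUID.1:ECX bit 27 (OSXSAVE) reports that the OS has enabled XSAVE, and that bits 1 and 2 of XCR0 (read with XGETBV) report OS-enabled SSE and YMM state. The function name `os_enabled_ymm_state` is hypothetical, and the caller is assumed to supply the raw register values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (not stated in this section). */
#define CPUID1_ECX_OSXSAVE (UINT32_C(1) << 27) /* OS has enabled XSAVE/XGETBV */
#define XCR0_SSE           (UINT64_C(1) << 1)  /* OS manages XMM/MXCSR state  */
#define XCR0_YMM           (UINT64_C(1) << 2)  /* OS manages YMM state        */

/*
 * Hypothetical helper: decide whether the OS has enabled YMM state.
 * cpuid1_ecx is the ECX output of CPUID leaf 1; xcr0 is the value
 * returned by XGETBV with ECX = 0. XCR0 is only meaningful when
 * OSXSAVE is set, so that bit is checked first.
 */
bool os_enabled_ymm_state(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    if ((cpuid1_ecx & CPUID1_ECX_OSXSAVE) == 0) {
        return false; /* XGETBV would itself fault; treat as not enabled */
    }
    const uint64_t required = XCR0_SSE | XCR0_YMM;
    return (xcr0 & required) == required;
}
```

Taking the register values as parameters keeps the decision logic separate from the privileged or compiler-specific code that reads CPUID and XCR0.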

3.2.1 Detection of YMM State Support<br />

Detection of hardware support for new processor extended state is provided by the main leaf of CPUID leaf function 0DH (index ECX = 0). Specifically, the return value in EDX:EAX of CPUID.(EAX=0DH, ECX=0) provides a 64-bit wide bit vector of hardware support for processor state components: bit 0 of EAX corresponds to x87 FPU state, CPUID.(EAX=0DH, ECX=0):EAX[1] corresponds to SSE state (XMM registers and MXCSR), and CPUID.(EAX=0DH, ECX=0):EAX[2] corresponds to YMM states.
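The bit vector described above can be decoded as in the following sketch. It assumes the caller has already executed CPUID with EAX=0DH and ECX=0 and passes in the resulting EAX and EDX values; the macro and function names (`XSTATE_YMM`, `xstate_vector`, `hw_supports_ymm_state`) are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the EDX:EAX vector of CPUID.(EAX=0DH, ECX=0). */
#define XSTATE_X87 (UINT64_C(1) << 0) /* x87 FPU state                */
#define XSTATE_SSE (UINT64_C(1) << 1) /* XMM registers and MXCSR      */
#define XSTATE_YMM (UINT64_C(1) << 2) /* YMM state                    */

/* Combine the two 32-bit CPUID outputs into the 64-bit support vector,
   with EDX supplying the upper half and EAX the lower half. */
uint64_t xstate_vector(uint32_t eax, uint32_t edx)
{
    return ((uint64_t)edx << 32) | (uint64_t)eax;
}

/* True when the hardware reports support for the YMM state component. */
bool hw_supports_ymm_state(uint32_t eax, uint32_t edx)
{
    return (xstate_vector(eax, edx) & XSTATE_YMM) != 0;
}
```

For example, an EAX value of 7 with EDX of 0 reports x87, SSE and YMM state support, whereas an EAX value of 3 reports only x87 and SSE state.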

3.2.2 Enabling of YMM State<br />

An OS can enable YMM state support with the following steps:<br />

Ref. # 319433-014 3-1
