13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



GENERAL OPTIMIZATION GUIDELINES

Use TEST when comparing the result of a logical AND with an immediate constant for equality or inequality if the register is EAX for cases such as:

    IF (AVAR & 8) { }

The TEST instruction can also be used to detect rollover of modulo of a power of 2. For example, the C code:

    IF ( (AVAR % 16) == 0 ) { }

can be implemented using:

    TEST EAX, 0x0F
    JNZ AfterIf

Using the TEST instruction between the instruction that may modify part of the flag register and the instruction that uses the flag register can also help prevent partial flag register stall.

Assembly/Compiler Coding Rule 39. (ML impact, M generality) Use the TEST instruction instead of AND when the result of the logical AND is not used. This saves µops in execution. Use a TEST of a register with itself instead of a CMP of the register to zero; this saves the need to encode the zero and saves encoding space. Avoid comparing a constant to a memory operand. It is preferable to load the memory operand and compare the constant to a register.

Often a produced value must be compared with zero, and then used in a branch. Because most Intel architecture instructions set the condition codes as part of their execution, the compare instruction may be eliminated. Thus the operation can be tested directly by a JCC instruction. The notable exceptions are MOV and LEA. In these cases, use TEST.

Assembly/Compiler Coding Rule 40. (ML impact, M generality) Eliminate unnecessary compare with zero instructions by using the appropriate conditional jump instruction when the flags are already set by a preceding arithmetic instruction. If necessary, use a TEST instruction instead of a compare. Be certain that any code transformations made do not introduce problems with overflow.

3.5.1.8 Using NOPs

Code generators generate a no-operation (NOP) to align instructions. Examples of NOPs of different lengths in 32-bit mode are shown below:

    1-byte: XCHG EAX, EAX
    2-byte: MOV REG, REG
    3-byte: LEA REG, 0 (REG) (8-bit displacement)
    4-byte: NOP DWORD PTR [EAX + 0] (8-bit displacement)
    5-byte: NOP DWORD PTR [EAX + EAX*1 + 0] (8-bit displacement)
    6-byte: LEA REG, 0 (REG) (32-bit displacement)
    7-byte: NOP DWORD PTR [EAX + 0] (32-bit displacement)
    8-byte: NOP DWORD PTR [EAX + EAX*1 + 0] (32-bit displacement)
    9-byte: NOP WORD PTR [EAX + EAX*1 + 0] (32-bit displacement)
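
The modulo test in the TEST discussion and the amount of NOP padding needed for alignment both reduce to masking with a power of two minus one. The following is a minimal C sketch of that arithmetic, not code from the manual. The function names (is_multiple_of_pow2, padding_to_align, split_into_nops) are hypothetical, and the greedy split into NOPs of at most 9 bytes is an assumption about how a code generator might choose from the lengths listed above; real assemblers and compilers may choose differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero when x is a multiple of pow2 (pow2 must be a power of two).
   For pow2 == 16 this is the same check as TEST EAX, 0x0F / JNZ AfterIf. */
static int is_multiple_of_pow2(uint32_t x, uint32_t pow2)
{
    return (x & (pow2 - 1u)) == 0;
}

/* Bytes needed to advance addr to the next multiple of align
   (align must be a power of two); 0 if addr is already aligned. */
static size_t padding_to_align(uintptr_t addr, size_t align)
{
    return (size_t)(((uintptr_t)0 - addr) & (uintptr_t)(align - 1));
}

/* Assumed greedy policy: cover pad bytes with NOPs of at most 9 bytes each.
   Writes the chosen lengths into out (capacity cap) and returns how many. */
static size_t split_into_nops(size_t pad, size_t *out, size_t cap)
{
    size_t n = 0;
    while (pad > 0 && n < cap) {
        size_t len = pad > 9 ? 9 : pad;
        out[n++] = len;
        pad -= len;
    }
    return n;
}

/* Small self-check tying the mask test back to the modulo form. */
static void check_examples(void)
{
    for (uint32_t x = 0; x < 64; ++x)
        assert(is_multiple_of_pow2(x, 16u) == ((x % 16u) == 0));
    assert(padding_to_align((uintptr_t)0x1003, 16) == 13);
}
```

For instance, an address ending in 0x3 needs 13 bytes to reach a 16-byte boundary, which the assumed greedy policy would cover with one 9-byte and one 4-byte NOP.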
