13.07.2015

Intel® 64 and IA-32 Architectures Optimization Reference Manual

SUMMARY OF RULES AND SUGGESTIONS

four µops and require multiple cycles to decode. Use sequences of simple instructions instead. (page 3-25)

Assembler/Compiler Coding Rule 32. (M impact, H generality) INC and DEC instructions should be replaced with ADD or SUB instructions, because ADD and SUB overwrite all flags, whereas INC and DEC do not, therefore creating false dependencies on earlier instructions that set the flags. (page 3-26)

Assembler/Compiler Coding Rule 33. (ML impact, L generality) If an LEA instruction using the scaled index is on the critical path, a sequence with ADDs may be better. If code density and bandwidth out of the trace cache are the critical factor, then use the LEA instruction. (page 3-27)

Assembler/Compiler Coding Rule 34. (ML impact, L generality) Avoid ROTATE by register or ROTATE by immediate instructions. If possible, replace with a ROTATE by 1 instruction. (page 3-27)

Assembler/Compiler Coding Rule 35. (M impact, ML generality) Use dependency-breaking-idiom instructions to set a register to 0, or to break a false dependence chain resulting from re-use of registers. In contexts where the condition codes must be preserved, move 0 into the register instead. This requires more code space than using XOR and SUB, but avoids setting the condition codes. (page 3-28)

Assembler/Compiler Coding Rule 36. (M impact, MH generality) Break dependences on portions of registers between instructions by operating on 32-bit registers instead of partial registers. For moves, this can be accomplished with 32-bit moves or by using MOVZX. (page 3-29)

Assembler/Compiler Coding Rule 37. (M impact, M generality) Try to use zero extension or operate on 32-bit operands instead of using moves with sign extension. (page 3-29)

Assembler/Compiler Coding Rule 38. (ML impact, L generality) Avoid placing instructions that use 32-bit immediates which cannot be encoded as sign-extended 16-bit immediates near each other. Try to schedule µops that have no immediate immediately before or after µops with 32-bit immediates. (page 3-29)

Assembler/Compiler Coding Rule 39. (ML impact, M generality) Use the TEST instruction instead of AND when the result of the logical AND is not used. This saves µops in execution. Use a TEST of a register with itself instead of a CMP of the register to zero; this saves the need to encode the zero and saves encoding space. Avoid comparing a constant to a memory operand. It is preferable to load the memory operand and compare the constant to a register. (page 3-30)

Assembler/Compiler Coding Rule 40. (ML impact, M generality) Eliminate unnecessary compare with zero instructions by using the appropriate conditional jump instruction when the flags are already set by a preceding arithmetic instruction. If necessary, use a TEST instruction instead of a compare. Be certain
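
Several of the rules above replace one instruction with another that produces the same result. The C sketch below is not taken from the manual. It only models the values and the zero flag involved in Rules 33, 35, 36, 37 and 39, so the equivalences can be checked on any machine. All function names are hypothetical, and the sketch says nothing about timing, µop counts or flag dependencies, which are the actual subject of the rules.

```c
#include <assert.h>
#include <stdint.h>

/* Rule 33: value produced by an LEA with a scaled index,
   for example lea eax, [ebx + ecx*4], modeled as plain arithmetic. */
static uint32_t lea_scaled(uint32_t base, uint32_t index) {
    return base + index * 4u;
}

/* Rule 33: the same value built from ADDs only. */
static uint32_t adds_only(uint32_t base, uint32_t index) {
    uint32_t t = index + index; /* first add:  2 * index */
    t = t + t;                  /* second add: 4 * index */
    return base + t;            /* third add:  base + 4 * index */
}

/* Rule 35: results of the zeroing idioms XOR r,r and SUB r,r. */
static uint32_t zero_by_xor(uint32_t r) { return r ^ r; }
static uint32_t zero_by_sub(uint32_t r) { return r - r; }

/* Rules 36 and 37: widening a byte to 32 bits, MOVZX-like. */
static uint32_t zero_extend8(uint8_t b) {
    return (uint32_t)b;
}

/* Rules 36 and 37: widening a byte to 32 bits, MOVSX-like.
   Bit 7 is copied into bits 8..31. */
static uint32_t sign_extend8(uint8_t b) {
    return (b & 0x80u) ? (0xFFFFFF00u | b) : (uint32_t)b;
}

/* Rule 39: zero flag after TEST r,r (computes r AND r)
   and after CMP r,0 (computes r - 0). */
static int zf_test_self(uint32_t r) { return (r & r) == 0u; }
static int zf_cmp_zero(uint32_t r)  { return (r - 0u) == 0u; }

/* Checks the modeled equivalences over a small sample of inputs. */
static void self_check(void) {
    for (uint32_t i = 0; i < 1000u; ++i) {
        assert(lea_scaled(0x1000u, i) == adds_only(0x1000u, i));
        assert(zero_by_xor(i) == 0u && zero_by_sub(i) == 0u);
        assert(zf_test_self(i) == zf_cmp_zero(i));
    }
    for (uint32_t b = 0; b < 256u; ++b) {
        /* Zero and sign extension agree only while bit 7 is clear. */
        int agree = zero_extend8((uint8_t)b) == sign_extend8((uint8_t)b);
        assert(agree == (b < 0x80u));
    }
}
```

Calling self_check() from a small main is enough to run the checks. Which instruction a compiler actually emits for any of these functions depends on the compiler and its options, so the sketch should be read as a check of the arithmetic, not as a way to force a particular encoding.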
