13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual


SUMMARY OF RULES AND SUGGESTIONS

loop body contains more than one conditional branch, then unroll so that the number of iterations is 16/(# conditional branches). .... 3-16

Assembler/Compiler Coding Rule 18. (ML impact, M generality) For improving fetch/decode throughput, give preference to the memory flavor of an instruction over the register-only flavor of the same instruction, if such an instruction can benefit from micro-fusion. .... 3-17

Assembler/Compiler Coding Rule 19. (M impact, ML generality) Employ macro-fusion where possible using instruction pairs that support macro-fusion. Prefer TEST over CMP if possible. Use unsigned variables and unsigned jumps when possible. Try to logically verify that a variable is non-negative at the time of comparison. Avoid CMP or TEST of MEM-IMM flavor when possible. However, do not add other instructions to avoid using the MEM-IMM flavor. .... 3-19

Assembler/Compiler Coding Rule 20. (M impact, ML generality) Software can enable macro-fusion when it can be logically determined that a variable is non-negative at the time of comparison; use TEST appropriately to enable macro-fusion when comparing a variable with 0. .... 3-21

Assembler/Compiler Coding Rule 21. (MH impact, MH generality) Favor generating code using imm8 or imm32 values instead of imm16 values. .... 3-22

Assembler/Compiler Coding Rule 22. (M impact, ML generality) Ensure that instructions using the 0xF7 opcode byte do not start at offset 14 of a fetch line; and avoid using these instructions to operate on 16-bit data, upcast short data to 32 bits. .... 3-23

Assembler/Compiler Coding Rule 23. (MH impact, MH generality) Break up a loop containing a long sequence of instructions into loops of shorter instruction blocks of no more than 18 instructions. .... 3-23

Assembler/Compiler Coding Rule 24. (MH impact, M generality) Avoid unrolling loops containing LCP stalls, if the unrolled block exceeds 18 instructions. .... 3-23

Assembler/Compiler Coding Rule 25. (M impact, M generality) Avoid putting explicit references to ESP in a sequence of stack operations (POP, PUSH, CALL, RET). .... 3-24

Assembler/Compiler Coding Rule 26. (ML impact, L generality) Use simple instructions that are less than eight bytes in length. .... 3-24

Assembler/Compiler Coding Rule 27. (M impact, MH generality) Avoid using prefixes to change the size of immediate and displacement. .... 3-24

Assembler/Compiler Coding Rule 28. (M impact, H generality) Favor single-micro-operation instructions. Also favor instructions with shorter latencies. .... 3-25

Assembler/Compiler Coding Rule 29. (M impact, L generality) Avoid prefixes, especially multiple non-0F-prefixed opcodes. .... 3-25

Assembler/Compiler Coding Rule 30. (M impact, L generality) Do not use many segment registers. .... 3-25

Assembler/Compiler Coding Rule 31. (ML impact, M generality) Avoid using complex instructions (for example, enter, leave, or loop) that have more than
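
Rules 19 through 22 are stated at the instruction level, but some of their intent can be approximated in C source. The sketch below is illustrative only and is not taken from the manual: the function names `sum_samples` and `count_nonzero` are hypothetical, and whether a compiler actually emits a fused TEST/Jcc pair or avoids 16-bit operations depends on the compiler, its flags, and the target microarchitecture.

```c
#include <stddef.h>
#include <stdint.h>

/* In the spirit of Rule 22: the data is stored as 16-bit values, but each
   element is widened to 32 bits before any arithmetic is done on it, so
   the additions themselves are 32-bit operations. */
uint32_t sum_samples(const uint16_t *samples, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        uint32_t widened = samples[i]; /* upcast short data to 32 bits */
        total += widened;
    }
    return total;
}

/* In the spirit of Rules 19 and 20: the counter is unsigned, so it is
   non-negative by construction, and both comparisons are against zero,
   which gives a compiler the option of using TEST instead of CMP. */
unsigned count_nonzero(const uint32_t *values, size_t count)
{
    unsigned hits = 0;
    for (size_t remaining = count; remaining != 0; --remaining) {
        if (values[remaining - 1] != 0) {
            ++hits;
        }
    }
    return hits;
}
```

Inspecting the generated assembly (for example with a compiler's `-S` option) is the only reliable way to confirm which instruction forms were chosen for a given build.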
