13.07.2015

Intel® 64 and IA-32 Architectures Optimization Reference Manual



APPENDIX E
SUMMARY OF RULES AND SUGGESTIONS

This appendix summarizes the rules and suggestions specified in this manual. Please be reminded that coding recommendations are ranked in importance according to these two criteria:

• Local impact (referred to earlier as “impact”) – the difference that a recommendation makes to performance for a given instance.
• Generality – how frequently such instances occur across all application domains.

Again, understand that this ranking is intentionally very approximate, and can vary depending on coding style, application domain, and other factors. Throughout the chapter you observed references to these criteria using the high, medium and low priorities for each recommendation. In places where there was no priority assigned, the local impact or generality has been determined not to be applicable.

E.1 ASSEMBLY/COMPILER CODING RULES

Assembler/Compiler Coding Rule 1. (MH impact, M generality) Arrange code to make basic blocks contiguous and eliminate unnecessary branches. (page 3-7)

Assembler/Compiler Coding Rule 2. (M impact, ML generality) Use the SETCC and CMOV instructions to eliminate unpredictable conditional branches where possible. Do not do this for predictable branches. Do not use these instructions to eliminate all unpredictable conditional branches (because using these instructions will incur execution overhead due to the requirement for executing both paths of a conditional branch). In addition, converting a conditional branch to SETCC or CMOV trades off control flow dependence for data dependence and restricts the capability of the out-of-order engine. When tuning, note that all Intel 64 and IA-32 processors usually have very high branch prediction rates. Consistently mispredicted branches are generally rare. Use these instructions only if the increase in computation time is less than the expected cost of a mispredicted branch. (page 3-7)

Assembler/Compiler Coding Rule 3. (M impact, H generality) Arrange code to be consistent with the static branch prediction algorithm: make the fall-through code following a conditional branch be the likely target for a branch with a forward target, and make the fall-through code following a conditional branch be the unlikely target for a branch with a backward target. (page 3-10)

Assembler/Compiler Coding Rule 4. (MH impact, MH generality) Near calls must be matched with near returns, and far calls must be matched with far
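
As a rough illustration of Assembler/Compiler Coding Rule 2, the C sketch below contrasts a data-dependent branch with two branch-free formulations. The function names (count_above_branchy, count_above_branchless, select_max) are hypothetical and do not come from the manual. Whether a compiler actually emits SETcc or CMOV for the branch-free forms depends on the compiler, target, and optimization flags, so treat this as a sketch of the source-level pattern rather than a guarantee about generated code.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy form: a compiler may emit one conditional jump per element.
 * If v[i] > limit is effectively random, that jump can mispredict often. */
size_t count_above_branchy(const int32_t *v, size_t n, int32_t limit)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > limit)
            count++;
    }
    return count;
}

/* Branch-free form: in C a comparison evaluates to 0 or 1, so the result
 * can be added directly. This is the source-level shape that a compiler
 * can lower to a SETcc-style sequence (assumption: not guaranteed). */
size_t count_above_branchless(const int32_t *v, size_t n, int32_t limit)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(v[i] > limit);
    return count;
}

/* Select form: both operands are already computed, so a compiler can
 * lower the choice to a CMOV-style conditional move (again, an assumption
 * about code generation, not something the C language promises). */
int32_t select_max(int32_t a, int32_t b)
{
    return (a > b) ? a : b;
}
```

Following the rule's own caveats, the branch-free variants are worth trying only where the branch is measurably unpredictable; for a predictable branch the extra data dependence may cost more than it saves.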
