13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



STACK ALIGNMENT

The solution to this problem is to have the function’s entry point assume only 4-byte alignment. If the function has a need for 8-byte or 16-byte alignment, then code can be inserted to dynamically align the stack appropriately, resulting in one of the stack frames shown in Figure 4-1.

[Figure 4-1. Stack Frames Based on Alignment Type. Two panels: "ESP-based Aligned Frame" and "EBP-based Aligned Frame". Labels in the ESP-based frame: Parameters, Return Address, Padding, Parameter Pointer, Register Save Area, Local Variables and Spill Slots, __cdecl Parameter Passing Space, __stdcall Parameter Passing Space, ESP. Labels in the EBP-based frame: Parameters, Return Address, Padding, Parameter Pointer, Return Address 1, Previous EBP, SEH/CEH Record, Local Variables and Spill Slots, EBP-frame Saved Register Area, Parameter Passing Space, EBP, ESP.]

As an optimization, an alternate entry point can be created that can be called when proper stack alignment is provided by the caller. Using call graph profiling of the VTune analyzer, calls to the normal (unaligned) entry point can be optimized into calls to the (alternate) aligned entry point when the stack can be proven to be properly aligned. Furthermore, a function alignment requirement attribute can be modified throughout the call graph so as to cause the least number of calls to unaligned entry points.

As an example of this, suppose function F has only a stack alignment requirement of 4, but it calls function G at many call sites, and in a loop. If G’s alignment requirement is 16, then by promoting F’s alignment requirement to 16, and making all calls to G go to its aligned entry point, the compiler can minimize the number of times that control passes through the unaligned entry points. Example D-1 and Example D-2 in the following sections illustrate this technique. Note the entry points foo and foo.aligned; the latter is the alternate aligned entry point.
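The dynamic alignment step described above comes down to simple address arithmetic. The sketch below models it in C on a simulated 32-bit stack pointer. It is an illustration only, not the code a compiler emits: the function names `align_stack_down` and `alignment_padding` are hypothetical. It assumes the stack grows toward lower addresses and that the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a simulated stack pointer down to a
 * power-of-two boundary. Because the stack is assumed to grow toward
 * lower addresses, aligning means clearing the low-order bits. */
static uint32_t align_stack_down(uint32_t sp, uint32_t alignment)
{
    /* The mask trick is only valid for nonzero powers of two. */
    assert(alignment != 0u && (alignment & (alignment - 1u)) == 0u);
    return sp & ~(alignment - 1u);
}

/* Hypothetical helper: number of padding bytes the alignment step
 * introduces between the entry stack pointer and the aligned one. */
static uint32_t alignment_padding(uint32_t sp, uint32_t alignment)
{
    return sp - align_stack_down(sp, alignment);
}
```

For instance, an entry stack pointer that is only 4-byte aligned can need 0, 4, 8, or 12 bytes of padding to reach a 16-byte boundary. A caller that already guarantees 16-byte alignment needs none, which is the case an alternate aligned entry point can exploit.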
