13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



INDEX

R
ratios, B-50
    branching and front end, B-52
references, 1-3
releases of, 2-48
replay, B-2
rounding control option, A-5
rules, E-1

S
sampling
    event-based, A-11
scheduling distance (PSD), 8-17
Self-modifying code, 3-63
SFENCE Instruction, 8-10
SHUFPS instruction, 6-3, 6-7
signed unpack, 5-7
SIMD
    auto-vectorization, 4-12
    cache instructions, 8-1
    classes, 4-11
    coding techniques, 4-7
    data alignment for MMX, 4-16
    data and stack alignment, 4-13
    data alignment for 128-bits, 4-16
    example computation, 2-45
    history, 2-45
    identifying hotspots, 4-6
    instruction selection, 4-25
    loop blocking, 4-23
    memory utilization, 4-18
    microarchitecture differences, 4-26
    MMX technology support, 4-2
    padding to align data, 4-14
    parallelism, 4-7
    SSE support, 4-2
    SSE2 support, 4-3
    SSE3 support, 4-3
    SSSE3 support, 4-4
    stack alignment for 128-bits, 4-15
    strip-mining, 4-22
    using arrays, 4-14
    vectorization, 4-7
    VTune capabilities, 4-6
SIMD floating-point instructions
    copying, shuffling, 6-12
    data arrangement, 6-3
    data deswizzling, 6-10
    data swizzling, 6-7
    different microarchitectures, 6-16
    general rules, 6-1
    horizontal ADD, 6-13
    Intel Core Duo processors, 6-22
    Intel Core Solo processors, 6-22
    planning considerations, 6-1
    reciprocal instructions, 6-1
    scalar code, 6-2
    SSE3 complex math, 6-18
    SSE3 FP programming, 6-17
    using
        ADDSUBPS, 6-19
        CVTTPS2PI, 6-16
        CVTTSS2SI, 6-16
        FXCH, 6-2
        HADDPS, 6-22
        HSUBPS, 6-22
        MOVAPD, 6-3
        MOVAPS, 6-3
        MOVHLPS, 6-13
        MOVHPS, 6-7, 6-10
        MOVLHPS, 6-13
        MOVLPS, 6-7, 6-10
        MOVSHDUP, 6-19
        MOVSLDUP, 6-19
        MOVUPD, 6-3
        MOVUPS, 6-3
        SHUFPS, 6-3, 6-7
        UNPACKHPS, 6-7
        UNPACKLPS, 6-7
        UNPCKHPS, 6-10
        UNPCKLPS, 6-10
    vertical vs horizontal computation, 6-3
    with x87 FP instructions, 6-2
SIMD technology, 2-48
SIMD-integer instructions
    64-bits to 128-bits, 5-36
    data alignment, 5-4
    data movement techniques, 5-6
    extract word, 5-12
    insert word, 5-13
    integer intensive, 5-1
    memory optimizations, 5-31
    move byte mask to integer, 5-14
    optimization by architecture, 5-37
    packed average (byte or word), 5-29
    packed multiply high unsigned, 5-28
    packed shuffle word, 5-15
    packed signed integer word maximum, 5-28
    packed sum of absolute differences, 5-28
    rules, 5-1
    signed unpack, 5-7
    unsigned unpack, 5-6
    using
        EMMS, 5-2
        MOVDQ, 5-35
        MOVQ2DQ, 5-18
        PABSW, 5-20
        PACKSSDW, 5-8
        PADDQ, 5-30
        PALIGNR, 5-4
        PAVGB, 5-29
        PAVGW, 5-29
        PEXTRW, 5-12
        PINSRW, 5-13
        PMADDWD, 5-30

Index-7
