13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual


INDEX

M
MASKMOVDQU instruction, 8-7
memory bank conflicts, 8-2
memory optimizations
    loading-storing to-from same DRAM page, 5-36
    overview, 5-31
    partial memory accesses, 5-32
    performance, 4-18
    reference instructions, 3-27
    using aligned stores, 5-36
    using prefetch, 8-12
MFENCE instruction, 8-11
micro-op fusion, 2-36
misaligned data access, 4-13
misalignment in the FIR filter, 4-15
mobile computing
    ACPI standard, 10-1, 10-3
    active power, 10-1
    battery life, 10-1, 10-5, 10-6
    C4-state, 10-4
    CD/DVD, WLAN, WiFi, 10-7
    C-states, 10-1, 10-3
    deep sleep transitions, 10-7
    deeper sleep, 10-4, 10-10
    Intel Mobile Platform SDK, 10-6
    OS APIs, 10-6
    OS changes processor frequency, 10-2
    OS synchronization APIs, 10-6
    overview, 10-1
    performance options, 10-5
    platform optimizations, 10-7
    P-states, 10-1
    SpeedStep technology, 10-8
    spin-loops, 10-6
    state transitions, 10-2
    static power, 10-1
    WM_POWERBROADCAST message, 10-7
MOVAPD instruction, 6-3
MOVAPS instruction, 6-3
MOVDDUP instruction, 6-17
move byte mask to integer, 5-14
MOVHLPS instruction, 6-13
MOVHPS instruction, 6-7, 6-10
MOVLHPS instruction, 6-13
MOVLPS instruction, 6-7, 6-10
MOVNTDQ instruction, 8-7
MOVNTI instruction, 8-7
MOVNTPD instruction, 8-7
MOVNTPS instruction, 8-7
MOVNTQ instruction, 8-7
MOVQ instruction, 5-35
MOVSHDUP instruction, 6-17, 6-19
MOVSLDUP instruction, 6-17, 6-19
MOVUPD instruction, 6-3
MOVUPS instruction, 6-3
multicore processors
    architecture, 2-1
    C-state considerations, 10-12
    energy considerations, 10-10
    features of, 2-41
    functional example, 2-41
    pipeline and core, 2-43
    SpeedStep technology, 10-11
    thread migration, 10-11
multiprocessor systems
    dual-core processors, 7-1
    HT Technology, 7-1
    optimization techniques, 7-1
    See also: multithreading & Hyper-Threading Technology
multithreading
    Amdahl’s law, 7-2
    application tools, 7-10
    bus optimization, 7-12
    compiler support, A-5
    dual-core technology, 3-5
    environment description, 7-1
    guidelines, 7-11
    hardware support, 3-5
    HT technology, 3-5
    Intel Core microarchitecture, 7-6
    parallel & sequential tasks, 7-2
    programming models, 7-4
    shared execution resources, 7-41
    specialized models, 7-6
    thread sync practices, 7-12
    See Hyper-Threading Technology

N
Newton-Raphson iteration, 6-1
non-coherent requests, 8-9
non-halted clock ticks, B-4
non-interleaved unpack, 5-10
non-sleep clock ticks, B-4
non-temporal stores, 8-8, 8-30
NOP, 3-30

O
OpenMP compiler directives, 7-10, A-5
optimization
    branch prediction, 3-6
    branch type selection, 3-13
    eliminating branches, 3-7
    features, 2-1
    general techniques, 3-1
    spin-wait and idle loops, 3-9
    static prediction, 3-9
    unrolling loops, 3-15
optimizing cache utilization
    cache management, 8-31
    examples, 8-11
    non-temporal store instructions, 8-7
    prefetch and load, 8-6
    prefetch instructions, 8-5
