13.07.2015

Intel® 64 and IA-32 Architectures Optimization Reference Manual



MULTICORE AND HYPER-THREADING TECHNOLOGY

Table 8-1. Properties of Synchronization Objects (Contd.)

Characteristics: Recommended use conditions

Operating System Synchronization Objects
  • Number of active threads is greater than number of cores
  • Waiting thousands of cycles for a signal
  • Synchronization among processes

Light Weight User Synchronization
  • Number of active threads is less than or equal to number of cores
  • Infrequent contention
  • Need inter process synchronization

Synchronization Object based on MONITOR/MWAIT
  • Same as light weight objects
  • MONITOR/MWAIT available

8.4.2 Synchronization for Short Periods

The frequency and duration that a thread needs to synchronize with other threads depends on application characteristics. When a synchronization loop needs very fast response, applications may use a spin-wait loop.

A spin-wait loop is typically used when one thread needs to wait a short amount of time for another thread to reach a point of synchronization. A spin-wait loop consists of a loop that compares a synchronization variable with some pre-defined value. See Example 8-4(a).

On a modern microprocessor with a superscalar speculative execution engine, a loop like this results in the issue of multiple simultaneous read requests from the spinning thread. These requests usually execute out-of-order, with each read request being allocated a buffer resource. On detection of a write by a worker thread to a load that is in progress, the processor must guarantee that no violations of memory order occur. The necessity of maintaining the order of outstanding memory operations inevitably costs the processor a severe penalty that impacts all threads.

This penalty occurs on the Pentium M processor and on the Intel Core Solo and Intel Core Duo processors. However, the penalty on these processors is small compared with penalties suffered on the Pentium 4 and Intel Xeon processors. There the performance penalty for exiting the loop is about 25 times more severe.

On a processor supporting HT Technology, spin-wait loops can consume a significant portion of the execution bandwidth of the processor. One logical processor executing a spin-wait loop can severely impact the performance of the other logical processor.
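The basic spin-wait loop described in this section can be sketched in C as follows. This is a minimal illustration, not a reproduction of Example 8-4(a): the names sync_var, SYNC_READY, spin_wait and signal_ready are hypothetical, the use of C11 atomics is an assumption, and the max_spins bound is an addition so that the waiter cannot spin forever.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values for the synchronization variable (not from the manual). */
enum { SYNC_NOT_READY = 0, SYNC_READY = 1 };

/* Hypothetical shared synchronization variable, written by the worker thread. */
static atomic_int sync_var = SYNC_NOT_READY;

/*
 * Spin-wait: repeatedly compare the synchronization variable with a
 * pre-defined value. Returns true as soon as the value is observed,
 * or false if max_spins iterations pass without observing it.
 * The bound is an assumption added for this sketch.
 */
static bool spin_wait(atomic_int *var, int expected, size_t max_spins)
{
    for (size_t i = 0; i < max_spins; ++i) {
        /* Each iteration issues a load of the shared variable. */
        if (atomic_load_explicit(var, memory_order_acquire) == expected)
            return true;
    }
    return false;
}

/* Called by the worker thread when it reaches the synchronization point. */
static void signal_ready(atomic_int *var)
{
    atomic_store_explicit(var, SYNC_READY, memory_order_release);
}
```

In use, the waiting thread would call spin_wait(&sync_var, SYNC_READY, limit) while a worker thread calls signal_ready(&sync_var). The tight stream of loads in the loop body is what produces the multiple outstanding read requests described above, so a loop of this plain form is subject to the exit penalty and bandwidth cost this section discusses.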
