Intel® 64 and IA-32 Architectures Optimization Reference Manual

MULTICORE AND HYPER-THREADING TECHNOLOGY

call a timing service API, such as Sleep(0), may be ineffective in minimizing the cost of thread synchronization. Because the control thread still behaves like a fast spinning loop, the only runnable worker thread must share execution resources with the spin-wait loop if both are running on the same physical processor that supports HT Technology. If there is more than one runnable worker thread, then calling a thread-blocking API, such as Sleep(0), could still release the processor running the spin-wait loop, allowing the processor to be used by another worker thread instead of the spinning loop.

A control thread waiting for the completion of worker threads can usually implement thread synchronization using a thread-blocking API or a timing service, if the worker threads require significant time to complete. Example 8-5(b) shows an example that reduces the overhead of the control thread in its thread synchronization.

Example 8-5. Coding Pitfall using Spin Wait Loop

(a) A spin-wait loop attempts to release the processor incorrectly. It experiences a performance penalty if the only worker thread and the control thread run on the same physical processor package.

    // Only one worker thread is running;
    // the control loop waits for the worker thread to complete.
    ResumeWorkThread(thread_handle);
    while (task_not_done) {
        Sleep(0);  // Returns immediately back to spin loop.
        …
    }

(b) A polling loop frees up the processor correctly.

    // Let a worker thread run and wait for completion.
    ResumeWorkThread(thread_handle);
    while (task_not_done) {
        Sleep(FIVE_MILISEC);
        // This processor is released for some duration; the processor
        // can be used by other threads.
        …
    }

In general, OS function calls should be used with care when synchronizing threads. When using OS-supported thread synchronization objects (critical section, mutex, or semaphore), preference should be given to the OS service that has the least synchronization overhead, such as a critical section.
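
The two waiting strategies discussed above (polling with a real sleep interval, and waiting on an OS synchronization object) can be sketched as follows. This is a minimal illustration only, not code from the manual: it assumes POSIX threads rather than the Win32 calls used in Example 8-5, and the names worker, wait_by_polling, wait_by_blocking, task_done, and result are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical shared state: a completion flag, a result slot,
 * and a mutex/condition-variable pair guarding them. */
static atomic_int task_done;
static int result;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;

/* Hypothetical worker: sums 1..n, then publishes completion. */
static void *worker(void *arg) {
    int n = *(int *)arg;
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum += i;
    pthread_mutex_lock(&lock);
    result = sum;
    atomic_store(&task_done, 1);
    pthread_cond_signal(&done_cv);   /* wake a blocked control thread */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Variant 1: polling loop with a non-zero sleep, in the spirit of
 * Example 8-5(b). The control thread gives up the processor for
 * roughly 5 ms between checks instead of spinning. */
int wait_by_polling(int n) {
    pthread_t t;
    struct timespec five_ms = {0, 5 * 1000 * 1000};
    atomic_store(&task_done, 0);
    int rc = pthread_create(&t, NULL, worker, &n);
    assert(rc == 0);
    (void)rc;
    while (!atomic_load(&task_done))
        nanosleep(&five_ms, NULL);
    pthread_join(t, NULL);           /* result is safe to read after join */
    return result;
}

/* Variant 2: block on a condition variable. The control thread
 * consumes no execution resources until the worker signals. */
int wait_by_blocking(int n) {
    pthread_t t;
    atomic_store(&task_done, 0);
    int rc = pthread_create(&t, NULL, worker, &n);
    assert(rc == 0);
    (void)rc;
    pthread_mutex_lock(&lock);
    while (!atomic_load(&task_done)) /* loop guards against spurious wakeups */
        pthread_cond_wait(&done_cv, &lock);
    int r = result;
    pthread_mutex_unlock(&lock);
    pthread_join(t, NULL);
    return r;
}
```

Calling wait_by_polling(100) or wait_by_blocking(100) from a main function returns the worker's sum in either case. The difference lies in what the control thread does while waiting: the polling variant wakes periodically to check the flag, whereas the blocking variant stays descheduled until signaled, which is usually preferable when the worker's run time is long or unpredictable.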
