13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



MULTICORE AND HYPER-THREADING TECHNOLOGY

8.2.3 Specialized Programming Models

Intel Core Duo processor and processors based on Intel Core microarchitecture offer a second-level cache shared by two processor cores in the same physical package. This provides opportunities for two application threads to access some application data while minimizing the overhead of bus traffic.

Multi-threaded applications may need to employ specialized programming models to take advantage of this type of hardware feature. One such scenario is referred to as producer-consumer. In this scenario, one thread writes data into some destination (hopefully in the second-level cache) and another thread executing on the other core in the same physical package subsequently reads data produced by the first thread.

The basic approach for implementing a producer-consumer model is to create two threads; one thread is the producer and the other is the consumer. Typically, the producer and consumer take turns to work on a buffer and inform each other when they are ready to exchange buffers. In a producer-consumer model, there is some thread synchronization overhead when buffers are exchanged between the producer and consumer. To achieve optimal scaling with the number of cores, the synchronization overhead must be kept low. This can be done by ensuring the producer and consumer threads have comparable time constants for completing each incremental task prior to exchanging buffers.

Example 8-1 illustrates the coding structure of single-threaded execution of a sequence of task units, where each task unit (either the producer or consumer) executes serially (shown in Figure 8-2). In the equivalent scenario under multithreaded execution, each producer-consumer pair is wrapped as a thread function and two threads can be scheduled on available processor resources simultaneously.

Example 8-1. Serial Execution of Producer and Consumer Work Items

    for (i = 0; i < number_of_iterations; i++) {
        producer (i, buff); // pass buffer index and buffer address
        consumer (i, buff);
    }

[Figure: the main thread executes P(1), C(1), P(1), C(1), P(1) in sequence]
Figure 8-2. Single-threaded Execution of Producer-consumer Threading Model
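
The contrast between the serial loop and a two-thread arrangement can be sketched as follows. This is a minimal illustration, not the manual's own threaded example: it assumes POSIX threads, two buffers that the threads alternate between, and a mutex plus condition variable as the "inform each other" mechanism. The work done by producer() and consumer(), the constants NUM_ITERATIONS, BUF_LEN and NUM_BUFS, and the helpers wait_for(), publish(), run_serial() and run_threaded() are all hypothetical names chosen for the sketch. It does not pin threads to cores, so it does not by itself guarantee that both threads share a second-level cache.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_ITERATIONS 8   /* hypothetical iteration count */
#define BUF_LEN 1024       /* hypothetical buffer length */
#define NUM_BUFS 2         /* the two threads alternate between two buffers */

static int buffers[NUM_BUFS][BUF_LEN];
static int full[NUM_BUFS]; /* 1 while a buffer holds unconsumed data */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;

/* Hypothetical work items: fill a buffer, then sum its contents. */
static void producer(int i, int *buff) {
    for (size_t k = 0; k < BUF_LEN; k++) buff[k] = i + (int)k;
}

static long consumer(int i, const int *buff) {
    long sum = 0;
    (void)i; /* index kept only to mirror the serial loop's signature */
    for (size_t k = 0; k < BUF_LEN; k++) sum += buff[k];
    return sum;
}

/* Serial baseline with the same shape as the single-threaded loop. */
long run_serial(void) {
    long total = 0;
    for (int i = 0; i < NUM_ITERATIONS; i++) {
        producer(i, buffers[0]);
        total += consumer(i, buffers[0]);
    }
    return total;
}

/* Block until buffer `slot` reaches the wanted state (0 empty, 1 full). */
static void wait_for(int slot, int state) {
    pthread_mutex_lock(&lock);
    while (full[slot] != state) pthread_cond_wait(&changed, &lock);
    pthread_mutex_unlock(&lock);
}

/* Record a new state for buffer `slot` and wake the other thread. */
static void publish(int slot, int state) {
    pthread_mutex_lock(&lock);
    full[slot] = state;
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&lock);
}

static void *producer_thread(void *arg) {
    (void)arg;
    for (int i = 0; i < NUM_ITERATIONS; i++) {
        int slot = i % NUM_BUFS;
        wait_for(slot, 0);            /* wait until the consumer released it */
        producer(i, buffers[slot]);
        publish(slot, 1);             /* hand the filled buffer over */
    }
    return NULL;
}

static void *consumer_thread(void *arg) {
    long *total = arg;
    for (int i = 0; i < NUM_ITERATIONS; i++) {
        int slot = i % NUM_BUFS;
        wait_for(slot, 1);            /* wait until the producer filled it */
        *total += consumer(i, buffers[slot]);
        publish(slot, 0);             /* return the buffer to the producer */
    }
    return NULL;
}

/* Run one producer thread and one consumer thread to completion. */
long run_threaded(void) {
    pthread_t p, c;
    long total = 0;
    for (int s = 0; s < NUM_BUFS; s++) full[s] = 0;
    pthread_create(&p, NULL, producer_thread, NULL);
    pthread_create(&c, NULL, consumer_thread, &total);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return total;
}
```

Calling run_serial() and run_threaded() from a small main() and comparing the two return values is a quick sanity check; on most toolchains the file builds with the -pthread compiler flag. With two buffers the producer can fill one while the consumer drains the other, so the time spent blocked in wait_for() stays small only when the two work items take comparable time, which is the balance condition described above.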
