13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



MULTICORE AND HYPER-THREADING TECHNOLOGY

8.2.1 Parallel Programming Models

Two common programming models for transforming independent task requirements into application threads are:

• Domain decomposition
• Functional decomposition

8.2.1.1 Domain Decomposition

Large compute-intensive tasks usually operate on data sets that can be divided into a number of small subsets, each having a large degree of computational independence. Examples include:

• Computation of a discrete cosine transform (DCT) on two-dimensional data by dividing the data into several subsets and creating threads to compute the transform on each subset
• Matrix multiplication; here, threads can be created to handle the multiplication of half of the matrix with the multiplier matrix

Domain decomposition is a programming model based on creating identical or similar threads to process smaller pieces of data independently. This model can take advantage of duplicated execution resources present in a traditional multiprocessor system. It can also take advantage of shared execution resources between two logical processors in HT Technology. This is because a data domain thread typically consumes only a fraction of the available on-chip execution resources.

Section 8.3.5, “Key Practices of Execution Resource Optimization,” discusses additional guidelines that can help data domain threads use shared execution resources cooperatively and avoid the pitfalls of creating contention for hardware resources between two threads.

8.2.2 Functional Decomposition

Applications usually process a wide variety of tasks with diverse functions and many unrelated data sets. For example, a video codec needs several different processing functions. These include DCT, motion estimation, and color conversion. Using a functional threading model, applications can program separate threads to do motion estimation, color conversion, and other functional tasks.

Functional decomposition will achieve more flexible thread-level parallelism if it is less dependent on the duplication of hardware resources. For example, a thread executing a sorting algorithm and a thread executing a matrix multiplication routine are not likely to require the same execution unit at the same time. A design recognizing this could take advantage of traditional multiprocessor systems as well as multiprocessor systems using processors supporting HT Technology.
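The two models can be contrasted in a short sketch. This is an illustration only, not code from the manual: the function names (matmul_rows, domain_decomposed_matmul, functionally_decomposed) are hypothetical, and Python is assumed purely for brevity. Python threads show how the work is divided; they are not a way to measure the execution-resource effects discussed above. The first routine splits the rows of a matrix across identical workers (domain decomposition), and the second runs a sort and a matrix multiplication as two different threads (functional decomposition).

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(a_rows, b):
    """Multiply a block of rows of A by the full multiplier matrix B."""
    cols = list(zip(*b))  # columns of B
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a_rows]


def domain_decomposed_matmul(a, b, workers=2):
    """Domain decomposition: identical workers, each on a slice of A's rows."""
    # Ceiling division so every row lands in some block; at least 1 per block.
    chunk = max(1, (len(a) + workers - 1) // workers)
    blocks = [a[i:i + chunk] for i in range(0, len(a), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so rows stay in order.
        partial = pool.map(matmul_rows, blocks, [b] * len(blocks))
        return [row for block in partial for row in block]


def functionally_decomposed(values, a, b):
    """Functional decomposition: one thread sorts, another multiplies."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        sort_job = pool.submit(sorted, values)
        mul_job = pool.submit(matmul_rows, a, b)
        return sort_job.result(), mul_job.result()


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(domain_decomposed_matmul(a, b))
    print(functionally_decomposed([3, 1, 2], a, b))
```

With workers=2, each worker in the first routine handles half of the matrix, matching the matrix multiplication example in Section 8.2.1.1. The second routine pairs the sorting and matrix multiplication tasks named in Section 8.2.2.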
