13.07.2015 Views

Intel(R) - Computational and Systems Biology at MIT



Intel® Math Kernel Library User's Guide

Applications based on IA-32 or Intel® 64 architecture. The addresses of the first elements of arrays and the leading dimension values, in bytes (n*element_size), of two-dimensional arrays should be divisible by the cache line size, which equals
• 32 bytes for Pentium® III processor
• 64 bytes for Pentium® 4 processor
• 128 bytes for processor using Intel® 64 architecture.

Applications based on IA-64 architecture. The sufficient conditions are as follows:
• For the C-style FFT, the distance L between arrays that represent real and imaginary parts is not divisible by 64. The best case is when L=k*64 + 16.
• Leading dimension values, in bytes (n*element_size), of two-dimensional arrays are not a power of two.

Hardware Configuration Tips

Dual-Core Intel® Xeon® processor 5100 series systems. To get the best Intel MKL performance on Dual-Core Intel® Xeon® processor 5100 series systems, you are advised to enable the Hardware DPL (streaming data) Prefetcher functionality of this processor. Configuration of this functionality is accomplished through appropriate BIOS settings where supported. Check your BIOS documentation for details.

The use of Hyper-Threading Technology. Hyper-Threading Technology (HT Technology) is especially effective when each thread is performing different types of operations and when there are under-utilized resources on the processor. Intel MKL fits neither of these criteria, as the threaded portions of the library execute at high efficiencies using most of the available resources and perform identical operations on each thread. You may obtain higher performance when using Intel MKL without HT Technology enabled. See Using Intel® MKL Parallelism for information on the default number of threads, changing this number, and other relevant details.

If you run with HT enabled, performance may be especially impacted if you run on fewer threads than physical cores. For example, as there are two threads to every physical core, the thread scheduler may assign two threads to some cores and ignore others altogether. If you are using the OpenMP* library of the Intel Compiler, read the respective User Guide on how to best set the affinity to avoid this situation. For Intel MKL, the recommended setting is KMP_AFFINITY=granularity=fine,compact,1,0.
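The alignment condition for IA-32 and Intel® 64 applications (array start address and leading dimension in bytes both divisible by the cache line size) can be illustrated with a minimal sketch in standard C11. This is not taken from the guide: the helper names padded_ld, is_aligned and alloc_matrix are hypothetical, the cache line size is passed in as a parameter that you would choose for your processor, and the allocation uses the C11 aligned_alloc function rather than any Intel MKL routine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: smallest leading dimension ld >= n (in elements)
 * such that ld * elem_size is divisible by the cache line size `line`. */
static size_t padded_ld(size_t n, size_t elem_size, size_t line)
{
    size_t ld = n;
    while ((ld * elem_size) % line != 0)
        ++ld;               /* terminates: ld = n + line always satisfies it */
    return ld;
}

/* Hypothetical helper: nonzero if address p is divisible by `line`. */
static int is_aligned(const void *p, size_t line)
{
    return ((uintptr_t)p % line) == 0;
}

/* Hypothetical helper: allocate a rows x ld matrix of doubles whose first
 * element sits on a cache line boundary. `line` must be a power of two
 * supported by the C library. Release the result with free(). */
static double *alloc_matrix(size_t rows, size_t ld, size_t line)
{
    /* Each row must span a whole number of cache lines; this also makes
     * the total size a multiple of `line`, as aligned_alloc expects. */
    assert((ld * sizeof(double)) % line == 0);
    return aligned_alloc(line, rows * ld * sizeof(double));
}
```

For example, with 8-byte doubles and an assumed 64-byte cache line, a matrix with 100 columns would be stored with a leading dimension of 104 elements (832 bytes, or 13 cache lines per row), and the padded leading dimension is what you would pass to the library as the leading dimension argument.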
