13.07.2015 Views

Intel(R) Math Kernel Library for Linux* OS User's Guide


mpiicc -O3 nodeperf.c -L$MKLPATH $MKLPATH/libmkl_intel_lp64.a \
    -Wl,--start-group $MKLPATH/libmkl_sequential.a \
    $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

Launching nodeperf.c on all the nodes is especially helpful in a very large cluster. nodeperf enables quick identification of the potential problem spot without numerous small MP LINPACK runs around the cluster in search of the bad node. It goes through all the nodes, one at a time, and reports the performance of DGEMM followed by some host identifier. Therefore, the higher the DGEMM performance, the faster that node was performing.

3. Edit HPL.dat to fit your cluster needs. Read through the HPL documentation for ideas on this. However, you should use at least 4 nodes.

4. Make an HPL run, using compile options such as ASYOUGO, ASYOUGO2, or ENDEARLY to aid in your search. These options give you insight into the performance sooner than HPL normally would.

   When doing so, follow these recommendations:

   - Use MP LINPACK, which is a patched version of HPL, to save time in the search. All performance-intrusive features are compile-optional in MP LINPACK. That is, if you do not use the new options explained in section Options to Reduce Search Time, these changes are disabled. The primary purpose of the additions is to assist you in finding solutions.

     HPL requires a long time to search for many different parameters. In MP LINPACK, the goal is to get the best possible number.

     Given that the input is not fixed, there is a large parameter space you must search over. An exhaustive search of all possible inputs is impractically large even for a powerful cluster. MP LINPACK optionally prints information on performance as it proceeds. You can also terminate early.

   - Save time by compiling with -DENDEARLY -DASYOUGO2 (described in the Options to Reduce Search Time section) and using a negative threshold. Do not use a negative threshold on the final run that you intend to submit as a Top500 entry. Set the threshold in line 13 of the HPL 2.0 input file HPL.dat.

   - If you are going to run a problem to completion, do it with -DASYOUGO (see Options to Reduce Search Time).

5. Using the quick performance feedback, return to step 3 and iterate until you are sure that the performance is as good as possible.
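The threshold edit recommended for search runs in step 4 can be scripted so that it is easy to switch back before a final run. The following is a minimal sketch, not part of MP LINPACK or HPL: the helper name set_threshold and the value -16.0 are illustrative assumptions, and the code assumes that line 13 of HPL.dat starts with the numeric threshold followed by a descriptive label, as in the HPL 2.0 input layout.

```python
def set_threshold(hpl_dat_text, threshold, line_no=13):
    """Return HPL.dat text with the leading value on the threshold line replaced.

    Hypothetical helper: assumes the HPL 2.0 layout, where line 13 holds
    the threshold value followed by a descriptive label.
    """
    lines = hpl_dat_text.splitlines()
    if len(lines) < line_no:
        raise ValueError(f"input has fewer than {line_no} lines")

    # Split the line into the value and the trailing label.
    fields = lines[line_no - 1].split(None, 1)
    if not fields:
        raise ValueError(f"line {line_no} is empty")

    # Fail loudly if the line does not start with a number; this guards
    # against editing a file with a different layout.
    float(fields[0])

    label = fields[1] if len(fields) > 1 else "threshold"
    lines[line_no - 1] = f"{threshold:<12} {label}"
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder text standing in for the first lines of a real HPL.dat.
    sample = "".join(f"placeholder line {i}\n" for i in range(1, 13))
    sample += "16.0         threshold\n"
    # Print only line 13 of the edited text.
    print(set_threshold(sample, -16.0).splitlines()[12])
```

To apply this to a real input file, read HPL.dat, pass its text through the helper, and write the result back. Restore a positive threshold before the final run that you intend to submit.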
