Magellan Final Report - Office of Science - U.S. Department of Energy
[Figure 9.4 (bar chart): Percentage Performance Relative to Native, 0-100, for GTC, PARATEC, and CAM under the TCP over IB, TCP over Eth, and VM configurations.]

Figure 9.4: Relative performance of each of the cases examined for the three applications: GTC, PARATEC, and CAM.
[Figure 9.5 (two bar charts): (a) Percentage of Computation Time Relative to Native and (b) Percentage of Communication Time Relative to Native, 0-100, for GTC, PARATEC, and CAM under the TCP over IB, TCP over Eth, and VM configurations.]

Figure 9.5: Performance ratio for the compute (a) and communication (b) parts of the runtime for each of the three applications GTC, PARATEC, and CAM for each of the cases under consideration.
urations are shown in Figure 9.4. For all the codes except CAM using TCP over Ethernet, the performance shows the expected trend, with each successive configuration showing decreased performance. To understand the reasons for the differing performance impact on each of the three codes, we analyzed them using the Integrated Performance Monitoring (IPM) framework [77, 84].

The principal performance measurement made by IPM is the percentage of the runtime spent communicating (the time in the MPI library), which also allows us to infer the time spent computing. The results of this analysis are shown in Figure 9.5a, which shows the compute time ratios for the various configurations. As expected from the HPCC results, only the VM configuration shows significant performance degradation for the compute portion of the runtime. Why the different applications slow down by different amounts is unclear at this point, but we suspect it is related to the mix of instructions each performs and how each instruction mix fares in the VM. The communication ratios are shown in Figure 9.5b. In this case the differences between the applications and between the various configurations are much more pronounced. As well as providing overall information about the amount of time spent in the MPI library, IPM also reports the type of MPI routine called, the size of each message sent, and the destination of the message. Using this information, we can understand why the different applications show different slowdowns. In the
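To make the derivation concrete, the following sketch shows how the percentages plotted in Figures 9.4 and 9.5 can be computed from the quantities IPM reports: total wall-clock time per configuration and time spent in the MPI library, with compute time inferred as their difference. This is an illustrative reconstruction, not IPM code, and the timing numbers used below are hypothetical placeholders, not Magellan measurements.

```python
def relative_percentages(native, config):
    """Return (total, compute, comm) percentages of a configuration
    relative to the native run. Each argument is a dict with 'total'
    wall-clock seconds and 'mpi' seconds spent in the MPI library;
    compute time is inferred as total - mpi, mirroring how IPM's
    communication measurement lets one infer the compute portion."""
    def compute_time(t):
        return t["total"] - t["mpi"]

    # A slower configuration yields a percentage below 100, matching
    # the bars in Figures 9.4 and 9.5.
    total_pct = 100.0 * native["total"] / config["total"]
    compute_pct = 100.0 * compute_time(native) / compute_time(config)
    comm_pct = 100.0 * native["mpi"] / config["mpi"]
    return total_pct, compute_pct, comm_pct


# Hypothetical timings for one application under one configuration.
native = {"total": 100.0, "mpi": 20.0}   # bare metal
vm = {"total": 160.0, "mpi": 60.0}       # virtual machine

total_pct, compute_pct, comm_pct = relative_percentages(native, vm)
print(f"total {total_pct:.1f}%, compute {compute_pct:.1f}%, comm {comm_pct:.1f}%")
# With these placeholder numbers: total 62.5%, compute 80.0%, comm 33.3%
```

The same decomposition explains why an application with a small communication fraction (such as a compute-bound code) can show only modest overall slowdown in a VM even when its communication is several times slower.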