29.12.2014 Views

Magellan Final Report - Office of Science - U.S. Department of Energy



Table 12.3: Annual Cost of the DOE NERSC HPC Center in the cloud

| NERSC | Cost in Cloud per Year | ALCF | Cost in Cloud per Year |
|---|---|---|---|
| Hopper (152,496 cores) | $148,993,548 | Intrepid (163,840 cores) | $160,077,004 |
| Franklin (38,320 cores) | $37,439,885 | Surveyor (4,096 cores) | $4,001,925 |
| Carver (3,200 cores) | $3,126,504 | Challenger (4,096 cores) | $4,001,925 |
| HPSS (17 PB, 9 PB transferred) | $8,189,952 | HPSS (16 PB, 9 PB transferred) | $1,062,297 |
| File Systems (2 PB Total) | $2,642,412 | File Systems (8 PB Total) | $10,066,329 |
| Total Cost in Cloud | $200,392,301 | Total Cost in Cloud | $179,209,482 |
| NERSC Total Annual Budget (including support and other services) | $55,000,000 | ALCF Total Annual Budget (including support and other services) | $41,000,000 |
| NERSC Annual Budget for systems and related costs only | $33,000,000 | ALCF Annual Budget for systems and related costs only | $30,750,000 |
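The totals in Table 12.3 can be re-derived from the line items. The following is a minimal Python sketch that uses only the figures printed in the table; the dictionary and function names are illustrative and do not come from the report. It also computes the per-core-year rate implied by each compute line, which comes out at roughly $977 for every system. The ALCF line items sum to $179,209,480, which is $2 below the printed total, presumably because the individual items were rounded.

```python
# Line items from Table 12.3, in USD per year. All names here are illustrative.
NERSC_ITEMS = {
    "Hopper": 148_993_548,
    "Franklin": 37_439_885,
    "Carver": 3_126_504,
    "HPSS": 8_189_952,
    "File Systems": 2_642_412,
}
ALCF_ITEMS = {
    "Intrepid": 160_077_004,
    "Surveyor": 4_001_925,
    "Challenger": 4_001_925,
    "HPSS": 1_062_297,
    "File Systems": 10_066_329,
}
# Core counts as listed in the table.
CORES = {
    "Hopper": 152_496,
    "Franklin": 38_320,
    "Carver": 3_200,
    "Intrepid": 163_840,
    "Surveyor": 4_096,
    "Challenger": 4_096,
}


def total(items):
    """Sum the annual cloud cost over all line items of one center."""
    return sum(items.values())


def cost_per_core_year(items, system):
    """Annual cloud cost of a compute system divided by its core count."""
    return items[system] / CORES[system]


if __name__ == "__main__":
    print("NERSC total:", total(NERSC_ITEMS))
    print("ALCF total: ", total(ALCF_ITEMS))
    for center in (NERSC_ITEMS, ALCF_ITEMS):
        for system in center:
            if system in CORES:
                print(system, round(cost_per_core_year(center, system), 2))
```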

estimate. In addition, the budgets for both centers include additional services and support that would not be captured in the cloud cost. Our users have indicated the need for these services for cloud solutions as well (Chapter 11). These additional services account for approximately 40% of NERSC’s annual budget and 25% of ALCF’s budget (calculated based on the staff at the centers who provide these additional services and support). Adjusting for these costs results in a comparison of approximately $200M for the commercial cloud versus $33M for the actual costs of NERSC to DOE, and approximately $180M for the commercial cloud versus $31M for the actual costs of ALCF to DOE.
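The budget adjustment above is simple arithmetic and can be sketched as follows. This is an illustrative Python sketch, not the report's own calculation: the function names are assumptions, and the inputs are the total budgets, service fractions, and cloud totals quoted in the text and in Table 12.3.

```python
def systems_only_budget(total_budget, services_fraction):
    """Budget left after removing the share spent on additional services and support."""
    return total_budget * (1.0 - services_fraction)


def cloud_to_center_ratio(cloud_cost, center_cost):
    """How many times more the cloud estimate costs than the center itself."""
    return cloud_cost / center_cost


# Figures quoted in the text and in Table 12.3 (USD per year).
nersc_systems = systems_only_budget(55_000_000, 0.40)  # about $33.0M
alcf_systems = systems_only_budget(41_000_000, 0.25)   # about $30.75M

nersc_ratio = cloud_to_center_ratio(200_392_301, nersc_systems)
alcf_ratio = cloud_to_center_ratio(179_209_482, alcf_systems)

if __name__ == "__main__":
    print(f"NERSC: ${nersc_systems:,.0f} systems budget, cloud is {nersc_ratio:.1f}x")
    print(f"ALCF:  ${alcf_systems:,.0f} systems budget, cloud is {alcf_ratio:.1f}x")
```

Under these inputs the cloud estimate is roughly six times the systems-only budget for each center.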

12.2.4 Application Workload Cost and HPL Analysis<br />

Another approach is to consider the cost of running a given workload in a commercial cloud and multiply that by the number of times the workload is expected to run over a period of time. Chapter 9 showed the costs for bioinformatics workloads and STAR workloads. This approach works best when the workload can be well encapsulated and doesn’t have extraneous dependencies on large data sets. For example, performing functional analysis on genomic data sets works well since the input data can be easily quantified and the run time for a given data set can be easily measured.
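The workload-based estimate described above can be sketched in a few lines of Python. This is a hedged illustration only: the function names, the pricing model (a flat hourly rate per instance, no data transfer or storage charges), and every number in the usage example are hypothetical and are not taken from the report or from any provider's price list.

```python
def cost_per_run(instances, hours_per_run, price_per_instance_hour):
    """Cost of one encapsulated run: instances x wall-clock hours x hourly price.

    Assumes a flat hourly price and ignores data transfer and storage charges.
    """
    return instances * hours_per_run * price_per_instance_hour


def projected_cost(run_cost, runs_in_period):
    """Multiply the cost of a single run by the number of runs expected in the period."""
    return run_cost * runs_in_period


if __name__ == "__main__":
    # Hypothetical workload: 10 instances for 2 hours at $1.50 per instance-hour,
    # repeated 100 times over the period of interest.
    one_run = cost_per_run(instances=10, hours_per_run=2.0, price_per_instance_hour=1.50)
    print(f"per run: ${one_run:,.2f}")
    print(f"period:  ${projected_cost(one_run, runs_in_period=100):,.2f}")
```

The estimate is only as good as the measured run time and the run count, which is why the approach suits workloads whose inputs and run times are easy to quantify.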

Here we use the Linpack benchmark as a stand-in for the various applications that run at NERSC and ALCF. One reason for choosing this benchmark is that it is a recognized benchmark with published results for many systems, which removes many of the complexities of comparing across different system architectures. It was also chosen because it somewhat favors the cloud systems: Linpack has high computational intensity and makes modest use of the interconnect compared with many real-world scientific applications using MPI (Section 9.1.5). For the comparison, we calculate the cost of a Teraflop Year, which represents the cost to deliver a full teraflop for an entire year. Table 12.4 shows a comparison of the cost of a Teraflop Year (based on Linpack) from Amazon, NERSC, and ALCF. All of the HPL results come from the published values listed on the TOP500 [80] website from the June 2011 list. The Amazon costs are based on 1-year reserved Cluster Compute Instances that are assumed to be fully utilized during the year (as described earlier). The NERSC and ALCF costs are based on each center’s total annual budget, which includes significantly more than just computation, as noted in the section above. The comparison illustrates that for tightly coupled applications (even ones with high computational intensity like Linpack), the cost of a commercial cloud is historically much higher (over a factor of 4). This factor is lower than in some of the other approaches, but, again, these calculations use the entire center budgets which include
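The Teraflop Year metric described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, the HPL figure is taken to be the sustained Linpack result in teraflops, and the inputs in the usage example are placeholders rather than the values in Table 12.4.

```python
def cost_per_teraflop_year(annual_cost_usd, hpl_teraflops):
    """Annual cost divided by sustained Linpack (HPL) performance in teraflops.

    The result is the cost of delivering one teraflop for a full year,
    assuming the system is fully utilized for that year.
    """
    if hpl_teraflops <= 0:
        raise ValueError("HPL performance must be positive")
    return annual_cost_usd / hpl_teraflops


def cost_factor(cloud_cost_per_tf_year, center_cost_per_tf_year):
    """Ratio of the cloud cost to the center cost for the same teraflop year."""
    return cloud_cost_per_tf_year / center_cost_per_tf_year


if __name__ == "__main__":
    # Placeholder inputs, NOT the report's numbers.
    cloud = cost_per_teraflop_year(annual_cost_usd=4_000_000, hpl_teraflops=40.0)
    center = cost_per_teraflop_year(annual_cost_usd=50_000_000, hpl_teraflops=1_000.0)
    print(f"cloud:  ${cloud:,.0f} per teraflop year")
    print(f"center: ${center:,.0f} per teraflop year")
    print(f"factor: {cost_factor(cloud, center):.1f}x")
```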
