Magellan Final Report - Office of Science - U.S. Department of Energy
commercial clouds. The second essentially compares the costs of an entire center. The final approach is<br />
application-centric.<br />
12.2.1 Assumptions and Inputs<br />
In all of the cost analyses, we have attempted to use the most cost-effective option available. For example,<br />
based on our benchmarking analysis, the Cluster Compute offering is the most cost-effective option from<br />
Amazon for tightly coupled MPI applications and even most CPU-intensive applications, since the instances<br />
are dedicated, resulting in less interference from other applications. Furthermore, most of the instance<br />
pricing works out to a roughly constant cost per core hour. For example, a Cluster Compute Instance is<br />
approximately 16x more capable than a regular small instance and the cost is approximately 16x more. So<br />
using smaller, less expensive instances isn’t more cost-effective if the application can effectively utilize all of<br />
the cores, which is true of most CPU-intensive scientific applications. In contrast, many web applications<br />
underutilize the CPU, making small instances more cost-effective for those use cases. For compute instances,<br />
we use a one-year reserved instance and assume the nodes are fully utilized over the entire year to compute<br />
an effective core-hour cost. With reserved instances, you pay a fixed up-front cost in order to pay a lower<br />
hourly cost. If the instance is used for a majority of the reserved period (one year in our analysis), this results<br />
in a lower effective rate. For example, an on-demand Cluster Compute instance costs $1.60 per hour, but a<br />
reserved instance that is used during the entire one-year period results in an effective rate of $1.05 per hour<br />
(a reduction of about 34%). We further divide this by the number of cores in the instance to arrive at an effective core-hour<br />
cost, which simplifies comparisons with other systems. Table 12.1 summarizes this calculation. It is worth<br />
noting that the lowest spot instance pricing is approximately 50% of this effective core-hour cost. We do not<br />
use this offering as a basis for the cost analysis, since the runtime for a spot instance is unpredictable and<br />
application programmers need to design their applications to handle pre-emption, which would not match<br />
the requirements for our applications. However, spot pricing does provide an estimate of the absolute lower<br />
bound for pricing, since it essentially reflects the price threshold below which Amazon is unwilling to offer a<br />
service.<br />
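The reserved-instance arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation behind Table 12.1: only the $1.60 on-demand rate and the $1.05 effective rate come from the text. The up-front fee, the reserved hourly rate, and the core count are hypothetical placeholders, chosen so that a fully used year reproduces the $1.05 figure.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def effective_hourly_rate(upfront_fee, reserved_hourly_rate, hours_used):
    """Spread the fixed up-front fee over the hours actually used."""
    if hours_used <= 0:
        raise ValueError("hours_used must be positive")
    return upfront_fee / hours_used + reserved_hourly_rate


def core_hour_cost(hourly_rate, cores):
    """Divide an instance's hourly rate by its core count."""
    return hourly_rate / cores


def break_even_hours(upfront_fee, reserved_hourly_rate, on_demand_rate):
    """Hours of use after which the reserved instance beats on-demand."""
    return upfront_fee / (on_demand_rate - reserved_hourly_rate)


# Figure taken from the text.
ON_DEMAND_RATE = 1.60  # dollars per instance hour

# Hypothetical placeholders (NOT Amazon's published pricing).
ASSUMED_UPFRONT_FEE = 4380.0       # dollars, paid once for the year
ASSUMED_RESERVED_HOURLY = 0.55     # dollars per instance hour
ASSUMED_CORES_PER_INSTANCE = 8     # assumed core count, for illustration

if __name__ == "__main__":
    rate = effective_hourly_rate(
        ASSUMED_UPFRONT_FEE, ASSUMED_RESERVED_HOURLY, HOURS_PER_YEAR
    )
    print(f"effective instance-hour rate: ${rate:.2f}")
    print(f"effective core-hour cost: "
          f"${core_hour_cost(rate, ASSUMED_CORES_PER_INSTANCE):.3f}")
    hours = break_even_hours(
        ASSUMED_UPFRONT_FEE, ASSUMED_RESERVED_HOURLY, ON_DEMAND_RATE
    )
    print(f"break-even after {hours:.0f} hours of use")
```

With these placeholder inputs the reserved instance only pays off after a little under half of the year has been used, which is why the analysis assumes full utilization over the reserved period.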
For file system costs, we use Amazon's Elastic Block Store (EBS) pricing. This<br />
most likely underestimates the costs, since it omits the costs for I/O requests and for the instances that<br />
would be required to act as file system servers. S3 is used to compute the costs for archival storage. S3<br />
uses a tiered cost system where the incremental storage costs decline as more data is stored in the system.<br />
For example, the monthly cost to store the first terabyte of data using reduced redundancy is $0.093 per<br />
gigabyte, while the monthly cost to store data between 1 TB and 49 TB is $0.083 per GB. For simplicity,<br />
we compute all S3 costs at the lowest applicable rate. For example, since the NERSC archival system has 19 PB of<br />
data stored, we use Amazon’s rate of $0.037 per GB per month for data stored above 5 PB with reduced<br />
redundancy. The cost for transactions is also omitted for simplicity, but would further increase the cost of<br />
using the commercial offering. The pricing data was collected from the Amazon website on September 30,<br />
2011.<br />
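The difference between true tiered pricing and the flat-rate simplification used here can be illustrated with a short Python sketch. It uses only the three rates quoted above. The decimal unit conversion (1 TB = 1,000 GB, 1 PB = 1,000,000 GB) and the reading of the second tier as covering the 48 TB between 1 TB and 49 TB are assumptions made for the example, and the intermediate tiers that the text does not list are deliberately left out.

```python
GB_PER_TB = 1000        # assumption: decimal units
GB_PER_PB = 1000 ** 2   # assumption: decimal units


def flat_monthly_cost(stored_gb, rate_per_gb):
    """Simplification used in the analysis: one rate for all stored data."""
    return stored_gb * rate_per_gb


def tiered_monthly_cost(stored_gb, tiers):
    """Charge each slice of data at its own tier rate.

    tiers is an ordered list of (tier_size_gb, rate_per_gb) pairs.
    Raises ValueError if the data exceeds the tiers provided.
    """
    remaining = stored_gb
    total = 0.0
    for tier_size_gb, rate_per_gb in tiers:
        if remaining <= 0:
            break
        billed = min(remaining, tier_size_gb)
        total += billed * rate_per_gb
        remaining -= billed
    if remaining > 0:
        raise ValueError("stored_gb exceeds the tiers provided")
    return total


# Only the two lowest-volume tiers quoted in the text are listed here.
TIERS_FROM_TEXT = [
    (1 * GB_PER_TB, 0.093),   # first terabyte
    (48 * GB_PER_TB, 0.083),  # data between 1 TB and 49 TB
]

LOWEST_RATE = 0.037  # dollars per GB-month, above 5 PB (from the text)

if __name__ == "__main__":
    small = tiered_monthly_cost(10 * GB_PER_TB, TIERS_FROM_TEXT)
    print(f"10 TB, tiered: ${small:,.2f} per month")
    archive = flat_monthly_cost(19 * GB_PER_PB, LOWEST_RATE)
    print(f"19 PB, flat lowest rate: ${archive:,.0f} per month")
```

Under the decimal-unit assumption, pricing the 19 PB archive entirely at $0.037 per GB comes to about $703,000 per month. Because the first 5 PB would actually be billed at higher tier rates, the flat-rate figure understates the S3 cost.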
12.2.2 Computed Hourly Cost of an HPC System<br />
One of the more direct methods to compare the cost of a DOE HPC system with cloud offerings is to compute<br />
the effective cost per core hour. This makes it relatively straightforward to compare with similar<br />
commercial cloud systems. However, determining this cost is problematic, since many of the costs used for the<br />
calculation are indirect or business sensitive. For the sake of comparison, we will use Hopper, a Cray<br />
XE-6 system recently deployed at NERSC. This system was selected for comparison because it is a relatively<br />
recent deployment, is large enough to capture economies of scale, and is tuned for scientific applications<br />
relevant to the DOE-SC community. In lieu of providing detailed costs that may be business sensitive, we<br />
will use conservative values for the cost, which are higher than actual costs. The Hopper contract has been<br />
valued at approximately $52M. We use the peak power of 3 MW (it typically uses around 2 MW),<br />
which translates into a power cost of $2.6M per year assuming a cost of $0.10 per kWh. In general, $0.10<br />