
Magellan Final Report - Office of Science - U.S. Department of Energy



commercial clouds. The second essentially compares the costs of an entire center. The final approach takes an application-centric approach.

12.2.1 Assumptions and Inputs

In all of the cost analyses, we have attempted to use the most cost-effective option available. For example, based on our benchmarking analysis, the Cluster Compute offering is the most cost-effective option from Amazon for tightly coupled MPI applications and even most CPU-intensive applications, since the instances are dedicated, resulting in less interference from other applications. Furthermore, most of the instance pricing works out to a roughly constant cost per core hour. For example, a Cluster Compute instance is approximately 16x more capable than a regular small instance, and its cost is approximately 16x higher. Using smaller, less expensive instances is therefore not more cost effective if the application can effectively utilize all of the cores, which is true of most CPU-intensive scientific applications. In contrast, many web applications underutilize the CPU, making small instances more cost effective for those use cases. For compute instances, we use a one-year reserved instance and assume the nodes are fully utilized over the entire year to compute an effective core-hour cost. With reserved instances, you pay a fixed up-front cost in exchange for a lower per-hour cost. If the instance is used for a majority of the reserved period (one year in our analysis), this results in a lower effective rate. For example, an on-demand Cluster Compute instance costs $1.60 per hour, but a reserved instance that is used during the entire one-year period results in an effective rate of $1.05 per hour (roughly a 30% reduction). We further divide this by the number of cores in the instance to arrive at an effective core-hour cost, which simplifies comparisons with other systems. Table 12.1 summarizes this calculation. It is worth noting that the lowest spot instance pricing is approximately 50% of this effective core-hour cost. We do not use this offering as a basis for the cost analysis, since the runtime for a spot instance is unpredictable and application programmers need to design their applications to handle pre-emption, which would not match the requirements for our applications. However, spot pricing does provide an estimate of the absolute lower bound for pricing, since it essentially reflects the price threshold below which Amazon is unwilling to offer the service.
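The reserved-instance arithmetic described above can be sketched as follows. This is a minimal illustration rather than the report's actual calculation: the up-front fee, the reserved hourly rate, and the core count are hypothetical placeholders (the text quotes only the $1.60 on-demand rate and the $1.05 effective rate, and the authoritative figures are those in Table 12.1), and the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in the one-year reservation period


def effective_hourly_rate(upfront_fee, reserved_hourly_rate, hours_used):
    """Spread the fixed up-front fee over the hours actually used."""
    if hours_used <= 0:
        raise ValueError("hours_used must be positive")
    return upfront_fee / hours_used + reserved_hourly_rate


def effective_core_hour_cost(hourly_rate, cores_per_instance):
    """Divide an instance-hour rate by the number of cores in the instance."""
    if cores_per_instance <= 0:
        raise ValueError("cores_per_instance must be positive")
    return hourly_rate / cores_per_instance


def break_even_hours(upfront_fee, reserved_hourly_rate, on_demand_hourly_rate):
    """Hours of use beyond which the reservation is cheaper than on-demand."""
    saving_per_hour = on_demand_hourly_rate - reserved_hourly_rate
    if saving_per_hour <= 0:
        raise ValueError("reserved rate must be below the on-demand rate")
    return upfront_fee / saving_per_hour


# Hypothetical placeholder inputs, NOT taken from the text.
ASSUMED_UPFRONT_FEE = 4290.0      # dollars, assumption
ASSUMED_RESERVED_HOURLY = 0.56    # dollars per hour, assumption
ASSUMED_CORES = 8                 # cores per instance, assumption
ON_DEMAND_HOURLY = 1.60           # dollars per hour, quoted in the text

full_year_rate = effective_hourly_rate(
    ASSUMED_UPFRONT_FEE, ASSUMED_RESERVED_HOURLY, HOURS_PER_YEAR
)
print(f"effective hourly rate: ${full_year_rate:.2f}")
print(f"effective core-hour cost: "
      f"${effective_core_hour_cost(full_year_rate, ASSUMED_CORES):.3f}")
print(f"break-even usage: "
      f"{break_even_hours(ASSUMED_UPFRONT_FEE, ASSUMED_RESERVED_HOURLY, ON_DEMAND_HOURLY):.0f} hours")
```

With these placeholder inputs the effective rate lands close to the quoted $1.05 per hour. The break-even helper makes the "majority of the reserved period" condition concrete: below that number of hours, on-demand pricing would be cheaper.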

We use Amazon Elastic Block Store (EBS) pricing to compute the storage costs for file systems. This most likely underestimates the costs, since it omits the costs for I/O requests and for the instances that would be required to act as file system servers. S3 is used to compute the costs for archival storage. S3 uses a tiered cost system in which the incremental storage cost declines as more data is stored. For example, the monthly cost to store the first terabyte of data using reduced redundancy is $0.093 per gigabyte, while the monthly cost to store data between 1 TB and 49 TB is $0.083 per GB. For simplicity, we compute all S3 costs at the lowest rate that applies to the total volume stored. For example, since the NERSC archival system has 19 PB of data stored, we use Amazon's rate of $0.037 per GB per month for data stored above 5 PB with reduced redundancy. The cost for transactions is also omitted for simplicity, but would further increase the cost of using the commercial offering. The pricing data was collected from the Amazon website on September 30, 2011.
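The tiered S3 pricing and the flat-rate simplification can be contrasted with a short sketch. It assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000,000 GB), which the text does not specify, and it encodes only the two tier rates quoted above. The intermediate tiers are not given here, so the tiered example is limited to a hypothetical 10 TB volume that falls inside the second tier.

```python
GB_PER_TB = 1000          # assumption: decimal units
GB_PER_PB = 1000 * 1000   # assumption: decimal units


def flat_monthly_cost(stored_gb, rate_per_gb):
    """Simplified estimate: bill every gigabyte at a single rate."""
    return stored_gb * rate_per_gb


def tiered_monthly_cost(stored_gb, tiers):
    """Bill each slice of data at the rate of the tier it falls into.

    tiers is an ordered list of (tier_size_gb, rate_per_gb) pairs;
    a tier_size_gb of None marks an unbounded final tier.
    """
    remaining = stored_gb
    total = 0.0
    for tier_size_gb, rate_per_gb in tiers:
        if remaining <= 0:
            break
        billed = remaining if tier_size_gb is None else min(remaining, tier_size_gb)
        total += billed * rate_per_gb
        remaining -= billed
    if remaining > 0:
        raise ValueError("tiers do not cover the stored volume")
    return total


# Only the two rates quoted in the text; the second tier is left unbounded
# here, so this table is valid only for volumes inside that tier.
DEMO_TIERS = [
    (1 * GB_PER_TB, 0.093),  # first terabyte
    (None, 0.083),           # second tier (upper bound deliberately omitted)
]

demo_cost = tiered_monthly_cost(10 * GB_PER_TB, DEMO_TIERS)  # hypothetical 10 TB
archive_cost = flat_monthly_cost(19 * GB_PER_PB, 0.037)      # 19 PB at the >5 PB rate
print(f"10 TB, tiered: ${demo_cost:,.0f} per month")
print(f"19 PB, flat:   ${archive_cost:,.0f} per month")
```

Under the decimal-unit assumption, the flat estimate for the 19 PB archive comes to about $703,000 per month. Because every gigabyte is billed at the cheapest tier's rate, the flat figure is slightly lower than what full tiered billing would produce.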

12.2.2 Computed Hourly Cost of an HPC System

One of the more direct methods to compare the cost of a DOE HPC system with cloud offerings is to compute the effective cost per core hour. This makes it relatively straightforward to compare with similar commercial cloud systems. However, determining this cost is problematic, since many of the costs used for the calculation are indirect or business sensitive. For the sake of comparison, we will use Hopper, a Cray XE6 system recently deployed at NERSC. This system was selected for comparison because it is a relatively recent deployment, is large enough to capture economies of scale, and is tuned for scientific applications relevant to the DOE-SC community. In lieu of providing detailed costs that may be business sensitive, we will use conservative values for the cost, which are higher than actual costs. The Hopper contract has been valued at approximately $52M. We use the peak power of 3 MW (it typically uses around 2 MW), which translates into a power cost of $2.6M per year assuming a cost of $0.10 per kWh (3,000 kW × 8,760 hours × $0.10 per kWh). In general, $0.10
