
[Figure 9.22 consists of two panels: (a) Transient Random-Write Bandwidth Degradation (90% Capacity), plotting write bandwidth (MB/s) against time (minutes) over roughly 50 minutes; and (b) Steady-State Random-Write Bandwidth Degradation, plotting percentage of peak write bandwidth at 30%, 50%, 70%, and 90% utilized capacity. Devices shown: Virident tachIOn (400 GB), TMS RamSan 20 (450 GB), Fusion IO ioDrive Duo (Single Slot, 160 GB), Intel X-25M (160 GB), and OCZ Colossus (250 GB).]

Figure 9.22: Graphs illustrating the degradation in the I/O bandwidth to various flash devices under a sustained random-write workload. The graph on the left shows the transient behavior, while the graph on the right compares the steady-state performance of the devices while varying the utilized capacity.
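The report does not reproduce the benchmark harness behind Figure 9.22; as a rough illustration only, the following Python sketch drives sustained 4 KiB random writes at a file on the device under test and reports bandwidth per interval, which is how a transient degradation curve like panel (a) can be produced. The path, test-region size, and run length are placeholders, not values from the Magellan measurements.

```python
# Hypothetical sketch (not the Magellan benchmark harness): sustained random
# writes with periodic bandwidth reporting, to observe transient degradation.
import os, time, random

PATH = "/mnt/flash/testfile"   # assumed mount point of the device under test
FILE_SIZE = 4 * 1024**3        # 4 GiB test region (adjust to utilized capacity)
BLOCK = 4096                   # 4 KiB random writes
INTERVAL = 10.0                # report bandwidth every 10 seconds
DURATION = 50 * 60             # run for 50 minutes, as in the figure

buf = os.urandom(BLOCK)
# O_SYNC forces each write to stable storage so the page cache does not hide
# the device's true write behavior.
fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
os.ftruncate(fd, FILE_SIZE)

blocks = FILE_SIZE // BLOCK
written = 0
start = last = time.time()
while time.time() - start < DURATION:
    os.pwrite(fd, buf, random.randrange(blocks) * BLOCK)
    written += BLOCK
    now = time.time()
    if now - last >= INTERVAL:
        print(f"{(now - start)/60:6.1f} min  {written / (now - last) / 1e6:8.1f} MB/s")
        written, last = 0, now
os.close(fd)
```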

9.5 Applications

We worked closely with our users to help them port their applications to cloud environments, to compare application performance with existing environments (where applicable), and to compare and contrast performance across applications. In this section, we outline select benchmarking results from our applications.

9.5.1 SPRUCE<br />

As mentioned in Chapter 3, the Magellan staff collaborated with the SPRUCE team to understand the implications of using cloud resources for urgent computing. In this context, some benchmarking was performed to measure the allocation delay (i.e., the time between a request for some number of instances and the moment when all requested instances are available) as the size of the request increases. These experiments were conducted on three separate cloud software stacks running on the ALCF Magellan hardware: Eucalyptus 1.6.2, Eucalyptus 2.0, and OpenStack.
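The SPRUCE benchmarking scripts are not included in the report; the sketch below is our own minimal illustration of how such an allocation-delay measurement could be scripted against an EC2-compatible endpoint (both Eucalyptus and OpenStack expose one). It uses boto3, and the endpoint URL, credentials, image ID, and instance type are placeholders, not values from the Magellan deployment.

```python
# Hypothetical allocation-delay measurement (not the SPRUCE team's harness):
# request n instances and time how long it takes for all of them to reach
# the "running" state via an EC2-compatible API.
import time
import boto3

ec2 = boto3.client(
    "ec2",
    endpoint_url="http://cloud.example.org:8773/services/Eucalyptus",  # placeholder
    aws_access_key_id="...", aws_secret_access_key="...",              # placeholders
    region_name="RegionOne",
)

def allocation_delay(n, image_id="emi-12345678", instance_type="m1.small"):
    """Return seconds from the run request until all n instances are running."""
    start = time.time()
    resp = ec2.run_instances(ImageId=image_id, InstanceType=instance_type,
                             MinCount=n, MaxCount=n)
    ids = [inst["InstanceId"] for inst in resp["Instances"]]
    while True:
        states = [
            inst["State"]["Name"]
            for r in ec2.describe_instances(InstanceIds=ids)["Reservations"]
            for inst in r["Instances"]
        ]
        if all(s == "running" for s in states):
            return time.time() - start
        if any(s in ("shutting-down", "terminated") for s in states):
            raise RuntimeError("an instance failed to reach the running state")
        time.sleep(5)

for n in (1, 2, 4, 8, 16, 32, 64, 128):
    print(n, "instances:", round(allocation_delay(n), 1), "s")
```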

The results of these benchmarks revealed some unexpected performance behaviors. First, Eucalyptus 1.6.2 offered very poor performance: the allocation delay increased linearly with the size of the request. This was unexpected because the image being launched was pre-cached across all of the nodes of the cloud, so the allocation delays should have been much more stable, since the vast majority of the work is done by the nodes rather than by the centralized cloud components. Also, as the number of requested instances increased, the stability of the cloud decreased. Instances were more likely to fail to reach a running state, and the cloud required a resting period between trials in order to recover; for the 128-instance trials, for example, the cloud needed to rest for 150 minutes between trials, or else all of the instances would fail to start. In Eucalyptus 2.0, the performance and stability issues appeared to be resolved: the allocation delays were much flatter (as one would expect), and the resting periods were no longer required. OpenStack, in comparison, offered shorter allocation delays, the result of an allocation process that used copy-on-write and sparse files for the disk images. Because the images were pre-cached on the nodes, this yielded significantly shorter allocation delays, particularly for smaller request sizes. However, the plot of the allocation delay was again not as flat as might be expected (see Figure 9.23).
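To illustrate why sparse (and copy-on-write) image files shorten allocation, the sketch below, which is our own illustration rather than code from the OpenStack release used on Magellan, creates a sparse instance-sized file and compares its logical size with the disk blocks actually allocated; provisioning such a file is nearly instantaneous because no data is copied up front.

```python
# Illustration of sparse file allocation: a large "disk image" that occupies
# almost no physical blocks until it is written to. Copy-on-write images work
# on the same principle, referencing a shared base image instead of copying it.
import os

PATH = "instance-disk.img"     # hypothetical per-instance image file
SIZE = 10 * 1024**3            # 10 GiB logical size

with open(PATH, "wb") as f:
    f.truncate(SIZE)           # extends the file without writing any data

st = os.stat(PATH)
print("logical size :", st.st_size, "bytes")          # 10 GiB
print("allocated    :", st.st_blocks * 512, "bytes")  # a few KiB at most
```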
