Magellan Final Report - Office of Science - U.S. Department of Energy
Chapter 4
Application Characteristics
A challenge in exploiting cloud computing for science is that the scientific community has needs that
can differ significantly from those of typical enterprise customers. Applications often run in a tightly coupled
manner at scales greater than most enterprise applications require. This often leads to bandwidth and
latency requirements that are more demanding than those of most cloud customers. Scientific applications also
typically require access to large amounts of data, including large volumes of legacy data. This can lead to
large startup and data storage costs. The goal of Magellan has been to analyze these requirements
using an advanced testbed, a flexible software stack, and a range of data-gathering efforts. In
this chapter we summarize some of the key application characteristics that are necessary to understand the
feasibility of clouds for these scientific applications. In Section 4.1 we summarize the computational models,
and in Section 4.2 we discuss some diverse usage scenarios. We summarize the results of the user survey in
Section 4.3 and discuss some scientific use cases in Section 4.4.
4.1 Computational Models
Scientific workloads can be classified into three broad categories based on their resource requirements:
large-scale tightly coupled computations, mid-range computing, and high throughput computing. In this section,
we describe each category at a high level and explain why cloud computing is attractive to these
application spaces.
Large-Scale Tightly Coupled. These are complex scientific codes generally running at large-scale supercomputing
centers across the nation. Typically, these are MPI codes using a large number of processors
(often on the order of thousands). A single job can use thousands to millions of core hours depending on
the scale. These jobs are usually run at supercomputing centers through batch queue systems. Users wait
in a managed queue to access the resources requested, and their jobs are run when the required resources
are available and no other jobs are ahead of them in the priority list. Most supercomputing centers provide
archival storage and parallel file system access for the storage and I/O needs of these applications. Applications
in this class are expected to take a performance hit when working in virtualized cloud environments [46].
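The batch queue behavior described above can be illustrated with a minimal sketch. It is not the scheduler of any particular center: the `Job` class, the `start_ready_jobs` function, the job names, and all core counts are hypothetical, chosen only to show a job starting when enough cores are free and no higher-priority job is waiting ahead of it. Core hours are taken as cores multiplied by wall-clock hours.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    # Only `priority` takes part in ordering; a lower value means
    # the job is closer to the front of the queue.
    priority: int
    name: str = field(compare=False)
    cores: int = field(compare=False)
    hours: float = field(compare=False)

    @property
    def core_hours(self) -> float:
        # Core hours consumed if the job runs for its full requested time.
        return self.cores * self.hours


def start_ready_jobs(queue, free_cores):
    """Start jobs strictly in priority order.

    Stops at the first job that does not fit, so a job never runs
    while another job is ahead of it in the priority list.
    """
    started = []
    while queue and queue[0].cores <= free_cores:
        job = heapq.heappop(queue)
        free_cores -= job.cores
        started.append(job)
    return started, free_cores


# Hypothetical workload: two large MPI jobs and one small job.
queue = []
for job in [
    Job(1, "climate", 4096, 12.0),
    Job(2, "fusion", 8192, 24.0),
    Job(3, "small", 64, 1.0),
]:
    heapq.heappush(queue, job)

started, free = start_ready_jobs(queue, free_cores=6000)
for job in started:
    print(job.name, job.core_hours)
print("free cores:", free, "still queued:", len(queue))
```

In this example only the highest-priority job starts. The second job needs more cores than remain, and the small job keeps waiting behind it even though it would fit, which is the strict ordering the text describes. Production schedulers typically relax this rule with backfilling, which the sketch deliberately leaves out.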
Mid-Range Tightly Coupled. These applications run at a smaller scale than the large-scale jobs. A
number of codes need tens to hundreds of processors. Some of these applications run at supercomputing
centers and backfill the queues. More commonly, users rely on small compute clusters that are
managed by the scientific groups themselves to satisfy these needs. These mid-range applications are good
candidates for cloud computing even though they might incur some performance hit.
High Throughput. Some scientific explorations are performed on desktops or local clusters and consist of
asynchronous, independent computations. Even in the case of large-scale science problems, a number of the