29.12.2014 Views

Magellan Final Report - Office of Science - U.S. Department of Energy



Chapter 7

User Support

Many of the aspects of cloud computing that make it so powerful also introduce new complexities and challenges for both users and user support staff. Cloud computing provides users the flexibility to customize their software stack, but it comes with the additional burden of managing the stack. Commercial cloud providers have a limited user support model and typically additional support comes at an extra cost. This chapter describes the user support model that was used for the Magellan project, including some of the challenges that emerged during the course of the project. We discuss the key aspects of cloud computing architecture that have bearing on user support. We discuss several examples of usage patterns of users and how these were addressed. Finally, we summarize the overall assessment of the user support process for mid-range computing users on cloud platforms.

7.1 Comparison of User Support Models

HPC centers provide a well-curated environment for robust, high-performance computing, which they make accessible to non-expert users through a variety of activities. In these environments, substantial effort is put into helping users to be productive and successful on the hosted platform. These efforts take a number of forms, from building a tuned software environment that is optimized for HPC workloads to user education and application porting and optimization. These efforts are important to the success of current and new users in HPC facilities, as many computational scientists are not necessarily deeply knowledgeable about the details of modern computing hardware and software architecture.

HPC centers typically provide a single system software stack, paired with purpose-built hardware, and a set of policies for user access and prioritization. Users rely on a relatively fixed set of interfaces for interaction with the resource manager, file system, and other facility services. Many HPC use cases are well covered within this scope; for example, this environment is adapted for MPI applications that perform I/O to a parallel file system. Other use cases, such as high-throughput computing and data-intensive computing, may not be so well supported at HPC centers. For example, computer scientists developing low-level runtime software for HPC applications have a particularly difficult time performing this work at production computing centers. Also, deploying Hadoop on demand for computations could be performed within the framework of a traditional HPC system, albeit with significant effort and in a less optimized fashion.

Cloud systems provide Application Programming Interfaces (APIs) for low-level resource provisioning. These APIs enable users to provision new virtual machines, storage, and network resources. These resources are configured by the user and can be built into complex networks including dozens, hundreds, or potentially even thousands of VMs with distinct software configurations, security policies, and service architectures. The flexibility of the capabilities provided by cloud APIs is substantial, allowing users to manage clusters built out of virtual machines hosted inside a cloud. This power comes at some cost in terms of support. The cloud model offers a large amount of flexibility, making it difficult and expensive to provide support to cloud users. The opportunities for errors or mistakes greatly increase once a user begins to modify virtual machine
