

Table 10.1: Comparison of HDFS with parallel file systems (GPFS and Lustre)

                              HDFS                         GPFS and Lustre
    Storage Location          Compute nodes                Dedicated servers
    Access Model              Custom (except with FUSE)    POSIX
    Typical Replication       3                            1
    Typical Stripe Size       64 MB                        1 MB
    Concurrent Writes         No                           Yes
    Performance Scales with   No. of compute nodes         No. of servers
    Scale of Largest Systems  O(10k) nodes                 O(100) servers
    User/Kernel Space         User                         Kernel

10.1 MapReduce<br />

MapReduce is a programming model that enables users to achieve large-scale parallelism when processing and generating large data sets. The MapReduce model has been used to process large amounts of index, log, and user-behavior data at global internet companies including Google, Yahoo!, Facebook, and Twitter. The model was designed to be implemented and deployed on very large clusters of commodity machines; it was originally developed for, and is still heavily used in, massive data analysis tasks. MapReduce jobs were originally designed to run on Google's proprietary distributed file system, the Google File System (GFS) [33]. GFS supports several features that are specifically suited to MapReduce workloads, such as data locality and data replication.

The MapReduce programming model consists of a map function that transforms input records into intermediate output data, grouped by a user-specified key, and a reduce function that processes the intermediate data to generate a final result. Both the map and reduce functions consume and generate key/value pairs. The reduce phase receives a sorted list of key/value pairs from the MapReduce framework. In the MapReduce model, all parallel instances of the mappers or reducers run the same program; any differentiation in their behavior must be based solely on the inputs they operate on.

The canonical example of MapReduce is a word-count algorithm: the map emits each word as the key and the number 1 as the value, and the reduce sums the values for each key, thus counting the total number of occurrences of each word.
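
As an illustration, the following is a minimal word-count sketch using the Hadoop MapReduce Java API (the org.apache.hadoop.mapreduce interfaces); the input and output paths are placeholders supplied on the command line:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Map: emit (word, 1) for every word in the input line.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reduce: sum the counts for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

The combiner reuses the reducer to pre-aggregate counts on each map node, reducing the volume of intermediate data shuffled across the network.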

10.2 Hadoop<br />

Hadoop is an open-source distributed computing platform that implements the MapReduce model. Hadoop consists of two core components: the job management framework that handles the map and reduce tasks, and the Hadoop Distributed File System (HDFS) [76]. Hadoop's job management framework is highly reliable and available, using techniques such as replication and automated restart of failed tasks. The framework also includes optimizations for heterogeneous environments and workloads, e.g., speculative (redundant) execution that reduces delays due to stragglers.
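
Speculative execution is controlled per job through the Hadoop configuration. As a sketch, assuming the Hadoop 2.x property names (older releases used the mapred.*.tasks.speculative.execution properties instead):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class SpeculativeConfig {
      public static Job configure() throws Exception {
        Configuration conf = new Configuration();
        // Allow the framework to launch backup copies of slow-running
        // map and reduce tasks (defaults may vary by release).
        conf.setBoolean("mapreduce.map.speculative", true);
        conf.setBoolean("mapreduce.reduce.speculative", true);
        return Job.getInstance(conf, "job with speculative execution");
      }
    }

Whichever backup or original attempt finishes first is kept; the other is killed, trading some redundant work for lower tail latency.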

HDFS is a highly scalable, fault-tolerant file system modeled after the Google File System. The data locality features of HDFS are used by the Hadoop scheduler to schedule the I/O-intensive map computations close to the data. The scalability and reliability characteristics of Hadoop suggest that it could be used as an engine for running scientific ensembles; however, the target application for Hadoop is very loosely coupled analysis of huge data sets. Table 10.1 shows a comparison of HDFS with parallel file systems such as GPFS and Lustre. HDFS fundamentally differs from parallel file systems in its underlying storage model: HDFS relies on local storage on each compute node, while parallel file systems are typically served from a set of dedicated I/O servers. Most parallel file systems use the POSIX interface, which enables applications to be portable across different file systems on various HPC systems. HDFS, on the other hand, has its own interface, requiring applications to be rewritten to leverage HDFS features. FUSE [31] provides a POSIX interface on top of HDFS, but with a performance penalty.
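
To illustrate the non-POSIX interface, the following is a minimal sketch that reads a file through the HDFS Java client API (org.apache.hadoop.fs); the NameNode address and file path are hypothetical placeholders, and the fs.defaultFS property name assumes Hadoop 2.x:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsRead {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; normally read from core-site.xml.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");

        // Applications go through the Hadoop FileSystem API rather than
        // ordinary POSIX open()/read() calls.
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/example/input.txt");  // hypothetical path

        try (FSDataInputStream in = fs.open(file);
             BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in))) {
          String line;
          while ((line = reader.readLine()) != null) {
            System.out.println(line);
          }
        }
      }
    }

Porting an existing POSIX application to this API is the rewriting cost referred to above; mounting HDFS through FUSE avoids the rewrite at the cost of performance.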

