19.08.2013 Views

Introduction and MapReduce - SNAP - Stanford University


Problem

If nodes fail, how to store data persistently?

Answer

Distributed File System:

Provides global file namespace

Google GFS; Hadoop HDFS

Typical usage pattern

Huge files (100s of GB to TB)

Data is rarely updated in place

Reads and appends are common
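
The usage pattern above (no in-place updates, only reads and appends) can be illustrated with a minimal Python sketch. It uses an ordinary local file as a stand-in for a file in a distributed file system; it does not call any GFS or HDFS API, and the names `append_record`, `read_records`, and `demo` are hypothetical helpers introduced only for this illustration.

```python
import os
import tempfile


def append_record(path, record):
    # Append mode: new bytes go at the end of the file,
    # and bytes already written are never modified.
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")


def read_records(path):
    # Sequential scan from the start of the file,
    # the typical way a large append-only file is read.
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def demo():
    # Assumption: a temporary local file stands in for a DFS file.
    path = os.path.join(tempfile.mkdtemp(), "events.log")
    append_record(path, "user=1 action=click")
    append_record(path, "user=2 action=view")
    # A later change is recorded as a new appended line,
    # not by overwriting an earlier one.
    append_record(path, "user=1 action=purchase")
    return read_records(path)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Because writers only ever add to the end of the file, readers can scan it sequentially without coordinating on which earlier bytes might have changed, which is one reason this access pattern suits very large files.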

1/8/2012 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu
