Introduction and MapReduce - SNAP - Stanford University
Problem
If nodes fail, how to store data persistently?
Answer
Distributed File System:
Provides global file namespace
Google GFS; Hadoop HDFS
Typical usage pattern
Huge files (100s of GB to TB)
Data is rarely updated in place
Reads and appends are common
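To make the usage pattern above concrete, here is a minimal in-memory sketch of an append-only store with one global namespace and replicated chunks. It is not how GFS or HDFS is implemented: the class name ToyDFS, its methods, and the default node count, replica count, and chunk size are all hypothetical, chosen only for illustration. As a simplification, each append starts fresh chunks instead of filling the last one.

```python
import random


class ToyDFS:
    """Toy model (hypothetical): global namespace, append-only files, replicated chunks."""

    def __init__(self, num_nodes=5, replicas=3, chunk_size=4):
        # Each "node" is a dict mapping chunk_id -> bytes stored on that node.
        self.nodes = [dict() for _ in range(num_nodes)]
        self.alive = [True] * num_nodes
        self.replicas = replicas
        self.chunk_size = chunk_size
        # Global namespace: path -> ordered list of (chunk_id, holder node indices).
        self.namespace = {}
        self._next_id = 0
        self._rng = random.Random(0)  # fixed seed so placement is repeatable

    def append(self, path, data):
        """Split data into chunks and store each chunk on `replicas` live nodes."""
        chunks = self.namespace.setdefault(path, [])
        for start in range(0, len(data), self.chunk_size):
            piece = data[start:start + self.chunk_size]
            chunk_id = self._next_id
            self._next_id += 1
            live = [n for n, ok in enumerate(self.alive) if ok]
            holders = self._rng.sample(live, self.replicas)
            for n in holders:
                self.nodes[n][chunk_id] = piece
            chunks.append((chunk_id, holders))

    def read(self, path):
        """Reassemble a file, taking each chunk from the first live replica."""
        out = b""
        for chunk_id, holders in self.namespace[path]:
            for n in holders:
                if self.alive[n]:
                    out += self.nodes[n][chunk_id]
                    break
            else:
                raise IOError(f"all replicas of chunk {chunk_id} are lost")
        return out

    def fail(self, node):
        """Simulate a node failure; its chunks become unreachable."""
        self.alive[node] = False


dfs = ToyDFS()
dfs.append("/logs/day1", b"hello ")
dfs.append("/logs/day1", b"world")   # files grow by appending, never by in-place update
dfs.fail(0)
dfs.fail(1)                          # two of five nodes lost; three replicas per chunk
recovered = dfs.read("/logs/day1")
```

With three replicas per chunk on five nodes, any two node failures still leave at least one live copy of every chunk, so the read succeeds. There is deliberately no method for overwriting bytes in place, mirroring the "rarely updated in place" pattern.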
1/8/2012 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu