Introduction and MapReduce - SNAP - Stanford University
Reliable distributed file system
Data kept in “chunks” spread across machines
Each chunk replicated on different machines
Seamless recovery from disk or machine failure
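A minimal sketch of the replication idea above, not the actual file system implementation. The replication factor of 3, the random placement policy, and the names `place_chunks` and `recover` are assumptions made for illustration; the slide states none of them.

```python
import random

REPLICAS = 3  # assumed replication factor; the slide does not give one


def place_chunks(chunk_ids, servers, replicas=REPLICAS, seed=0):
    """Store each chunk on `replicas` distinct servers (hypothetical random policy)."""
    rng = random.Random(seed)
    return {chunk: set(rng.sample(servers, replicas)) for chunk in chunk_ids}


def recover(placement, failed, servers, replicas=REPLICAS, seed=1):
    """After `failed` goes down, re-replicate every chunk that lost a copy."""
    rng = random.Random(seed)
    alive = [s for s in servers if s != failed]
    for holders in placement.values():
        holders.discard(failed)
        # Servers that are still up and do not yet hold this chunk.
        candidates = [s for s in alive if s not in holders]
        while len(holders) < replicas and candidates:
            holders.add(candidates.pop(rng.randrange(len(candidates))))
    return placement


servers = ["cs1", "cs2", "cs3", "cs4"]
placement = place_chunks(["C0", "C1", "C2", "C5", "D0", "D1"], servers)
placement = recover(placement, failed="cs2", servers=servers)
```

Because every chunk lives on several distinct machines, losing one machine never removes the last copy, and the surviving copies are enough to restore the target replica count.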
[Figure: chunks replicated across chunk servers]
Chunk server 1: C0, C5, C1, C2
Chunk server 2: D0, C5, C1, C3
Chunk server 3: C2, D0, C5, D1
…
Chunk server N: C0, D0, C5, C2
Bring computation directly to the data!
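One way to read "bring computation directly to the data" is that each task runs on a machine that already stores the chunk it needs. The sketch below illustrates that idea only; the function name `schedule_tasks`, the least-loaded tie-breaking rule, and the server labels are hypothetical, and the chunk layout is as read from the figure.

```python
from collections import Counter


def schedule_tasks(chunk_locations):
    """Assign one task per chunk to a server that already stores that chunk."""
    load = Counter()
    assignment = {}
    for chunk in sorted(chunk_locations):
        holders = chunk_locations[chunk]
        # Among servers holding the chunk, pick the least loaded (ties by name).
        server = min(sorted(holders), key=lambda s: load[s])
        assignment[chunk] = server
        load[server] += 1
    return assignment


# Chunk layout as read from the figure; server labels are illustrative.
chunk_locations = {
    "C0": {"server_1", "server_N"},
    "C1": {"server_1", "server_2"},
    "C2": {"server_1", "server_3", "server_N"},
    "C3": {"server_2"},
    "C5": {"server_1", "server_2", "server_3", "server_N"},
    "D0": {"server_2", "server_3", "server_N"},
    "D1": {"server_3"},
}
assignment = schedule_tasks(chunk_locations)
```

Every task reads its chunk from local disk, so no chunk has to cross the network before processing starts.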
1/8/2012 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu