
Introduction and MapReduce - SNAP - Stanford University

Large-scale computing for data mining problems on commodity hardware

Challenges:

How do you distribute computation?

How can we make it easy to write distributed programs?

Machines fail:

One server may stay up about 3 years (~1,000 days)

If you have 1,000 servers, expect to lose 1/day

In Aug 2006 Google had ~450,000 machines
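
The failure arithmetic above can be sketched in a few lines. This is a rough back-of-the-envelope model, not something from the slides: it assumes servers fail independently and that each one fails on average once per 1,000 days. The function name `expected_failures_per_day` is hypothetical, and applying the same mean uptime to the ~450,000-machine fleet is an assumption made only for illustration.

```python
def expected_failures_per_day(num_servers: int, mean_uptime_days: float) -> float:
    """Expected number of server failures per day.

    Assumes independent servers, each failing on average once
    every `mean_uptime_days` days (a simplifying assumption).
    """
    return num_servers / mean_uptime_days


# From the slide: one server stays up roughly 1,000 days.
MEAN_UPTIME_DAYS = 1_000

# 1 server, the 1,000-server example, and the Aug 2006 fleet estimate.
for n in (1, 1_000, 450_000):
    rate = expected_failures_per_day(n, MEAN_UPTIME_DAYS)
    print(f"{n:>7,} servers: {rate:g} expected failures/day")
```

Under these assumptions, 1,000 servers give one expected failure per day, and a fleet of ~450,000 machines would see on the order of 450 failures per day, which is why failure handling has to be built into the system rather than treated as an exception.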

1/8/2012 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu
