19.08.2013 Views

Introduction and MapReduce - SNAP - Stanford University


Statistical machine translation:

Need to count the number of times every 5-word sequence occurs in a large corpus of documents

Very easy with MapReduce:

Map:

Extract (5-word sequence, count) from document

Reduce:

Combine counts
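
The Map and Reduce steps above can be sketched in plain Python. This is only an illustrative, single-machine simulation, not the slide author's code or any framework's API: the function names `map_five_grams`, `reduce_counts`, and `run_mapreduce` are hypothetical, whitespace tokenization is an assumption, and the in-memory grouping stands in for the shuffle phase that a real MapReduce system would perform across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

FiveGram = Tuple[str, ...]


def map_five_grams(document: str, n: int = 5) -> Iterator[Tuple[FiveGram, int]]:
    """Map: emit (n-word sequence, 1) for every window of n words in the document."""
    words = document.split()  # assumption: simple whitespace tokenization
    for i in range(len(words) - n + 1):
        yield tuple(words[i:i + n]), 1


def reduce_counts(key: FiveGram, counts: Iterable[int]) -> Tuple[FiveGram, int]:
    """Reduce: combine all counts emitted for one sequence."""
    return key, sum(counts)


def run_mapreduce(documents: Iterable[str]) -> Dict[FiveGram, int]:
    """Simulate map, group-by-key, and reduce in memory."""
    groups: Dict[FiveGram, List[int]] = defaultdict(list)
    for doc in documents:
        for key, value in map_five_grams(doc):
            groups[key].append(value)  # stands in for the shuffle/group step
    return dict(reduce_counts(key, values) for key, values in groups.items())


if __name__ == "__main__":
    corpus = ["a b c d e f", "a b c d e"]
    for sequence, count in run_mapreduce(corpus).items():
        print(" ".join(sequence), count)
```

Because each mapper only emits the sequences found in its own document and each reducer only sums the counts for one key, both steps can run in parallel over a large corpus.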

1/8/2012 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu
