13.11.2012 Views

Hadoop Development - CSC


What Is Hadoop?

• Google famously encountered these problems when it wanted to index the World Wide Web. Its solution:

– Google File System (GFS) for storage

– Google Map-Reduce to rapidly process, in a highly parallel way, data stored in GFS

• Google's solution is proprietary, and its success created demand for similar capabilities that other companies could use, including its competitors, most notably Yahoo!

• The result of that demand is an Apache open source project called Hadoop that provides:

– HDFS (Hadoop Distributed File System), equivalent in capability to GFS

– MapReduce to process the data stored in HDFS, equivalent in capability to Google Map-Reduce
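To make the MapReduce idea above more concrete, here is a minimal single-process sketch of the programming model in plain Python. It does not use the Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for illustration. In a real cluster the map and reduce steps would run in parallel on many machines over data stored in HDFS; here they simply run in sequence on an in-memory list.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data in HDFS", "MapReduce processes data in HDFS"]
    print(word_count(sample))
```

Because each call to `map_phase` depends on only one line and each call to `reduce_phase` on only one key, both steps could in principle be distributed across machines, which is the property that a framework like Hadoop relies on.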

TBSC 2009<br />

11/10/2011 12:53 PM 0725-23_TBSC 2009 9
