Hadoop Development - CSC
What Is Hadoop?
• Google famously encountered these problems when it wanted to index the
World Wide Web. Its solution:
– Google File System (GFS) for storage
– Google MapReduce to rapidly process, in a highly parallel way, data
stored in GFS
• Google's solution is proprietary, and its success created demand for
similar capabilities that other companies could use, including its
competitors, most notably Yahoo!
• The result of that demand is an Apache open source project called
Hadoop that provides:
– HDFS (Hadoop Distributed File System), equivalent in capability to GFS
– MapReduce to process the data stored in HDFS, equivalent in capability to
Google MapReduce
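The MapReduce model named above can be illustrated with a minimal word-count sketch. This is an in-memory illustration only: it does not use the Hadoop API or HDFS, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this example. It assumes the usual three steps of the model: map each record to key/value pairs, group the pairs by key, then reduce each group to a result.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["Hadoop stores data in HDFS", "MapReduce processes data in HDFS"]
    print(word_count(sample))
```

In a real deployment the map and reduce calls would run in parallel on many machines, each reading its share of the data from the distributed file system. Here everything runs sequentially in one process to keep the idea visible.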
TBSC 2009<br />