Hadoop Development - CSC
What Is <strong>Hadoop</strong>? (cont’d)<br />
• <strong>Hadoop</strong> can scale<br />
– Yahoo! has many clusters, the largest is 4000 nodes providing 16PB of HDFS (4 x 1TB HDDs/server)<br />
– Facebook has a 2000 node cluster providing 21PB of HDFS (12 x 1TB HDDs/server)<br />
• In July 2011, Facebook announced a 30PB <strong>Hadoop</strong> cluster in a new “bleeding edge” data centre<br />
• What <strong>Hadoop</strong>/HDFS is not<br />
– A Database - it does not require structured data<br />
– A POSIX file system<br />
– A real-time system: it is batch only<br />
• <strong>Hadoop</strong> is map-reduce only<br />
– Not all problems lend themselves to this type of solution<br />
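To make the map-reduce constraint concrete, here is a minimal sketch of the shape a problem must take to fit the model, using word counting as the example. It is plain Python run in a single process, not the Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine every value seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop can scale", "hadoop is batch"]))
```

A problem fits this model when each input record can be mapped independently and the results combined per key. Work that needs shared mutable state, or many dependent passes over the same data, is harder to express this way.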
TBSC 2009<br />