13.11.2012 Views

Hadoop Development - CSC



What Is Hadoop? (cont’d)

• Hadoop can scale
    – Yahoo! has many clusters; the largest has 4000 nodes providing 16PB of HDFS (4 x 1TB HDDs/server)
    – Facebook has a 2000-node cluster providing 21PB of HDFS (12 x 1TB HDDs/server)
• In July 2011, Facebook announced a 30PB Hadoop cluster in a new “bleeding edge” data centre
• What Hadoop/HDFS is not
    – A database: it does not require structured data
    – A POSIX file system
    – Real-time: it is batch only
• Hadoop is map-reduce only
    – Not all problems necessarily lend themselves to this type of solution
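
To make the map-reduce bullet concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps applied to a word count. It is an illustration only and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three steps in sequence (hypothetical helper, single process)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop can scale", "Hadoop is batch only"]))
```

Word counting fits this model because each line can be mapped independently and each key can be reduced independently. Problems that need shared, frequently updated state between records tend to fit less well, which is the limitation the last bullet points at.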

TBSC 2009

11/10/2011 12:53 PM 0725-23_TBSC 2009 11
