04.07.2013 Views

Hadoop: The Definitive Guide - Cdn.oreilly.com


15. Sqoop 527

Getting Sqoop 527

Sqoop Connectors 529

A Sample Import 529

Text and Binary File Formats 532

Generated Code 532

Additional Serialization Systems 533

Imports: A Deeper Look 533

Controlling the Import 535

Imports and Consistency 536

Direct-mode Imports 536

Working with Imported Data 536

Imported Data and Hive 537

Importing Large Objects 540

Performing an Export 542

Exports: A Deeper Look 543

Exports and Transactionality 545

Exports and SequenceFiles 545

16. Case Studies 547

Hadoop Usage at Last.fm 547

Last.fm: The Social Music Revolution 547

Hadoop at Last.fm 547

Generating Charts with Hadoop 548

The Track Statistics Program 549

Summary 556

Hadoop and Hive at Facebook 556

Hadoop at Facebook 556

Hypothetical Use Case Studies 559

Hive 562

Problems and Future Work 566

Nutch Search Engine 567

Data Structures 568

Selected Examples of Hadoop Data Processing in Nutch 571

Summary 580

Log Processing at Rackspace 581

Requirements/The Problem 581

Brief History 582

Choosing Hadoop 582

Collection and Storage 582

MapReduce for Logs 583

Cascading 589

Fields, Tuples, and Pipes 590

Table of Contents | xiii
