Hadoop: The Definitive Guide - Cdn.oreilly.com
15. Sqoop 527
Getting Sqoop 527
Sqoop Connectors 529
A Sample Import 529
Text and Binary File Formats 532
Generated Code 532
Additional Serialization Systems 533
Imports: A Deeper Look 533
Controlling the Import 535
Imports and Consistency 536
Direct-mode Imports 536
Working with Imported Data 536
Imported Data and Hive 537
Importing Large Objects 540
Performing an Export 542
Exports: A Deeper Look 543
Exports and Transactionality 545
Exports and SequenceFiles 545
16. Case Studies 547
Hadoop Usage at Last.fm 547
Last.fm: The Social Music Revolution 547
Hadoop at Last.fm 547
Generating Charts with Hadoop 548
The Track Statistics Program 549
Summary 556
Hadoop and Hive at Facebook 556
Hadoop at Facebook 556
Hypothetical Use Case Studies 559
Hive 562
Problems and Future Work 566
Nutch Search Engine 567
Data Structures 568
Selected Examples of Hadoop Data Processing in Nutch 571
Summary 580
Log Processing at Rackspace 581
Requirements/The Problem 581
Brief History 582
Choosing Hadoop 582
Collection and Storage 582
MapReduce for Logs 583
Cascading 589
Fields, Tuples, and Pipes 590
Table of Contents | xiii