04.07.2013 Views

Hadoop: The Definitive Guide - Cdn.oreilly.com

User-Defined Streaming Counters

Sorting

Preparation

Partial Sort

Total Sort

Secondary Sort

Joins

Map-Side Joins

Reduce-Side Joins

Side Data Distribution

Using the Job Configuration

Distributed Cache

MapReduce Library Classes

9. Setting Up a Hadoop Cluster

Cluster Specification

Network Topology

Cluster Setup and Installation

Installing Java

Creating a Hadoop User

Installing Hadoop

Testing the Installation

SSH Configuration

Hadoop Configuration

Configuration Management

Environment Settings

Important Hadoop Daemon Properties

Hadoop Daemon Addresses and Ports

Other Hadoop Properties

User Account Creation

YARN Configuration

Important YARN Daemon Properties

YARN Daemon Addresses and Ports

Security

Kerberos and Hadoop

Delegation Tokens

Other Security Enhancements

Benchmarking a Hadoop Cluster

Hadoop Benchmarks

User Jobs

Hadoop in the Cloud

Apache Whirr
