Chapter 2
HADOOP Procedure

Overview: HADOOP Procedure
Concepts: HADOOP Procedure
Using PROC HADOOP
Submitting Hadoop Distributed File System Commands
Submitting MapReduce Programs
Submitting Pig Language Code
Syntax: HADOOP Procedure
PROC HADOOP Statement
HDFS Statement
MAPREDUCE Statement
PIG Statement
Examples: HADOOP Procedure
Example 1: Submitting HDFS Commands
Example 2: Submitting a MapReduce Program
Example 3: Submitting Pig Language Code

Overview: HADOOP Procedure
PROC HADOOP enables SAS to run Apache Hadoop code against Hadoop data.
Apache Hadoop is an open-source technology, written in Java, that provides data storage
and distributed processing of large amounts of data.
PROC HADOOP interfaces with the Hadoop JobTracker, the service within
Hadoop that assigns tasks to specific nodes in the cluster. PROC HADOOP enables you
to submit the following:
• HDFS commands<br />
• MapReduce programs<br />
• Pig language code<br />
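
As a minimal sketch of what such a submission can look like, the step below issues one HDFS command through PROC HADOOP. The fileref `cfg`, the configuration file path, the user name, the password, and the HDFS directory are hypothetical placeholders, not values taken from this chapter; substitute the settings for your own cluster.

```sas
/* Hypothetical fileref pointing at a Hadoop configuration file (assumed path). */
filename cfg '/path/to/hadoop_config.xml';

/* Connect with placeholder credentials and submit one HDFS command. */
proc hadoop options=cfg username="myuser" password="mypassword" verbose;
   /* Create a directory in HDFS; the directory name is illustrative only. */
   hdfs mkdir='/user/myuser/new_directory';
run;
```

MapReduce programs and Pig language code are submitted within the same kind of PROC HADOOP step, through the MAPREDUCE and PIG statements instead of the HDFS statement.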