cloudera-spark

Running Spark Applications

Cluster mode is not well suited to using Spark interactively. Spark applications that require user input, such as spark-shell and pyspark, require the Spark driver to run inside the client process that initiates the Spark application.

Client Deployment Mode

In client mode, the Spark driver runs on the host where the job is submitted. The ApplicationMaster is responsible only for requesting executor containers from YARN. After the containers start, the client communicates with the containers to schedule work.
Table 3: Deployment Mode Summary

Mode                        YARN Client Mode                         YARN Cluster Mode
Driver runs in              Client                                   ApplicationMaster
Requests resources          ApplicationMaster                        ApplicationMaster
Starts executor processes   YARN NodeManager                         YARN NodeManager
Persistent services         YARN ResourceManager and NodeManagers    YARN ResourceManager and NodeManagers
Supports Spark Shell        Yes                                      No

Configuring the Environment

Spark requires that the HADOOP_CONF_DIR or YARN_CONF_DIR environment variable point to the directory containing the client-side configuration files for the cluster. These configurations are used to write to HDFS and connect to the YARN ResourceManager. If you are using a Cloudera Manager deployment, these variables are configured automatically. If you are using an unmanaged deployment, ensure that you set the variables as described in Running Spark on YARN.

Running a Spark Shell Application on YARN

To run the spark-shell or pyspark client on YARN, use the --master yarn --deploy-mode client flags when you start the application.
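The environment check and the launch flags described above can be illustrated with a small Python sketch. It is not part of Spark or Cloudera Manager: the helper names find_conf_dir and build_yarn_command are hypothetical, and the order in which the two variables are checked is an arbitrary choice made for the example. The sketch only builds the command line. It does not start a shell.

```python
import os
from typing import List, Mapping, Optional

# Interactive clients named in the text; they need the driver
# to run inside the client process, so cluster mode is rejected.
INTERACTIVE_CLIENTS = {"spark-shell", "pyspark"}


def find_conf_dir(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty value of HADOOP_CONF_DIR or
    YARN_CONF_DIR, or None if neither is set.

    Hypothetical helper; the lookup order is an assumption of this sketch.
    """
    for name in ("HADOOP_CONF_DIR", "YARN_CONF_DIR"):
        value = env.get(name)
        if value:
            return value
    return None


def build_yarn_command(client: str, deploy_mode: str = "client") -> List[str]:
    """Build the argument list for starting `client` on YARN.

    Hypothetical helper; only the flags quoted in the text are used.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode!r}")
    if client in INTERACTIVE_CLIENTS and deploy_mode == "cluster":
        raise ValueError(f"{client} is interactive and requires client mode")
    return [client, "--master", "yarn", "--deploy-mode", deploy_mode]


if __name__ == "__main__":
    conf_dir = find_conf_dir(os.environ)
    if conf_dir is None:
        print("Set HADOOP_CONF_DIR or YARN_CONF_DIR before starting the shell.")
    else:
        print("Using client configuration from", conf_dir)
        print(" ".join(build_yarn_command("spark-shell")))
```

Keeping the check separate from the command construction means the same guard can run before either spark-shell or pyspark, and a request for cluster mode with an interactive client fails early instead of at submission time.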
