I am trying to install and run PySpark in a Jupyter notebook on AWS Elastic MapReduce (EMR). Here is the session info and the code I am running:

%%info

Current session configs: {'driverMemory': '1000M', 'executorCores': 2, 'kind': 'pyspark'}

from pyspark.sql import SparkSession

# Create (or reuse) a SparkSession through the Livy-backed kernel
spark = SparkSession.builder.appName("docker-numpy").getOrCreate()
sc = spark.sparkContext

Output

The code failed because of a fatal error:
    Unable to create Session. Error: Unexpected endpoint: http://172.31.3.115:8998.

Some things to try:
a) Make sure Spark has enough available resources for Jupyter to create a Spark context.
b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.
c) Restart the kernel.

where 172.31.3.115 is my master's internal/private IP. I have edited the sparkmagic configuration on the notebook instance (notebook@ip-x-x-x-x$ more .sparkmagic/config) as follows:

{
  "kernel_python_credentials" : {
    "username": "",
    "password": "",
    "url": "http://172.31.3.115:8998",
    "auth": "None"
  },

  "kernel_scala_credentials" : {
    "username": "",
    "password": "",
    "url": "http://172.31.3.115:8998",
    "auth": "None"
  },
  "kernel_r_credentials": {
    "username": "",
    "password": "",
    "url": "http://172.31.3.115:8998"
  },

  "logging_config": {
    "version": 1,
    "formatters": {
      "magicsFormatter": { 
        "format": "%(asctime)s\t%(levelname)s\t%(message)s",
        "datefmt": ""
      }
    },
    "handlers": {
      "magicsHandler": { 
        "class": "hdijupyterutils.filehandler.MagicsFileHandler",
        "formatter": "magicsFormatter",
        "home_path": "~/.sparkmagic"
      }
    },
    "loggers": {
      "magicsLogger": { 
        "handlers": ["magicsHandler"],
        "level": "DEBUG",
        "propagate": 0
      }
    }
  },

  "wait_for_idle_timeout_seconds": 15,
  "livy_session_startup_timeout_seconds": 60,

  "fatal_error_suggestion": "The code failed because of a fatal error:\n\t{}.\n\nSome things to try:\na) Make sure Spark has enough available resources for Jupyter to create a Spark context.\nb) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.\nc) Restart the kernel.",

  "ignore_ssl_errors": false,

  "session_configs": {
    "driverMemory": "1000M",
    "executorCores": 2
  },

  "use_auto_viz": true,
  "coerce_dataframe": true,
  "max_results_sql": 2500,
  "pyspark_dataframe_encoding": "utf-8",

  "heartbeat_refresh_seconds": 30,
  "livy_server_heartbeat_timeout_seconds": 0,
  "heartbeat_retry_seconds": 10,

  "server_extension_default_kernel_name": "pysparkkernel",
  "custom_headers": {},

  "retry_policy": "configurable",
  "retry_seconds_to_sleep_list": [0.2, 0.5, 1, 3, 5],
  "configurable_retry_policy_max_retries": 8
}
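
To rule out basic connectivity problems, a quick sanity check (assuming Livy's standard REST API, which serves GET /sessions) is to query the endpoint from the config above directly from the notebook instance:

# Hypothetical check: list Livy sessions on the EMR master (adjust the IP)
curl http://172.31.3.115:8998/sessions

If Livy is up and reachable, this returns a JSON list of sessions rather than an error.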

Like many others, I have tried the suggestions in 1 and 2. First of all, I am not able to locate SPARK_HOME on EMR. I also have a question: how do I install Livy on EMR, or set the Advanced Cluster Options? I am creating the cluster manually using the AWS CLI as follows:

aws emr create-cluster \
 --name 'EMR 6.0.0 with Docker' \
 --release-label emr-6.0.0 \
 --applications Name=Livy Name=Spark Name=Hadoop Name=JupyterHub \
 --ec2-attributes "KeyName=sowmya_private_key,SubnetId=subnet-b39550d8" \
 --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m5.xlarge InstanceGroupType=CORE,InstanceCount=2,InstanceType=m5.xlarge \
 --use-default-roles \
 --configurations file://./emr-configuration.json

which returns the following cluster ID:

{
    "ClusterId": "j-3T56U7A09JWAD"
}
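
For context, the Docker-on-EMR blog post linked below configures emr-configuration.json to whitelist the Docker registries that YARN is allowed to pull from; a minimal sketch of that classification (the ECR endpoint is a placeholder, substitute your own account and region):

[
  {
    "Classification": "container-executor",
    "Configurations": [
      {
        "Classification": "docker",
        "Properties": {
          "docker.trusted.registries": "local,centos,123456789012.dkr.ecr.us-east-2.amazonaws.com",
          "docker.privileged-containers.registries": "local,centos,123456789012.dkr.ecr.us-east-2.amazonaws.com"
        }
      }
    ]
  }
]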

I have been following these tutorials from AWS:

https://aws.amazon.com/blogs/machine-learning/build-amazon-sagemaker-notebooks-backed-by-spark-in-amazon-emr/

and

https://aws.amazon.com/blogs/big-data/simplify-your-spark-dependency-management-with-docker-in-emr-6-0-0/

Privacy aside, here is the full error log:

The code failed because of a fatal error:
    Session 1 unexpectedly reached final status 'dead'. See logs:
stdout: 

stderr: 
20/06/06 04:05:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/06/06 04:05:16 INFO RMProxy: Connecting to ResourceManager at ip-172-31-3-115.us-east-2.compute.internal/172.31.3.115:8032
20/06/06 04:05:16 INFO Client: Requesting a new application from cluster with 2 NodeManagers
20/06/06 04:05:16 INFO Configuration: resource-types.xml not found
20/06/06 04:05:16 INFO ResourceUtils: Unable to find 'resource-types.xml'.
20/06/06 04:05:16 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (12288 MB per container)
20/06/06 04:05:16 INFO Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
20/06/06 04:05:16 INFO Client: Setting up container launch context for our AM
20/06/06 04:05:16 INFO Client: Setting up the launch environment for our AM container
20/06/06 04:05:16 INFO Client: Preparing resources for our AM container
20/06/06 04:05:16 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
20/06/06 04:05:18 INFO Client: Uploading resource file:/mnt/tmp/spark-0cd5b0e0-9c69-4105-835f-ce1c484787d4/__spark_libs__3675935773843248835.zip -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/__spark_libs__3675935773843248835.zip
20/06/06 04:05:18 INFO Client: Uploading resource file:/usr/lib/livy/rsc-jars/livy-api-0.6.0-incubating.jar -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/livy-api-0.6.0-incubating.jar
20/06/06 04:05:18 INFO Client: Uploading resource file:/usr/lib/livy/rsc-jars/livy-rsc-0.6.0-incubating.jar -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/livy-rsc-0.6.0-incubating.jar
20/06/06 04:05:18 INFO Client: Uploading resource file:/usr/lib/livy/rsc-jars/netty-all-4.1.17.Final.jar -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/netty-all-4.1.17.Final.jar
20/06/06 04:05:18 INFO Client: Uploading resource file:/usr/lib/livy/repl_2.12-jars/commons-codec-1.9.jar -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/commons-codec-1.9.jar
20/06/06 04:05:19 INFO Client: Uploading resource file:/usr/lib/livy/repl_2.12-jars/livy-core_2.12-0.6.0-incubating.jar -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/livy-core_2.12-0.6.0-incubating.jar
20/06/06 04:05:19 INFO Client: Uploading resource file:/usr/lib/livy/repl_2.12-jars/livy-repl_2.12-0.6.0-incubating.jar -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/livy-repl_2.12-0.6.0-incubating.jar
20/06/06 04:05:19 INFO Client: Uploading resource file:/usr/lib/spark/R/lib/sparkr.zip#sparkr -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/sparkr.zip
20/06/06 04:05:19 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/pyspark.zip -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/pyspark.zip
20/06/06 04:05:19 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/py4j-0.10.7-src.zip -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/py4j-0.10.7-src.zip
20/06/06 04:05:19 WARN Client: Same name resource file:///usr/lib/spark/python/lib/pyspark.zip added multiple times to distributed cache
20/06/06 04:05:19 WARN Client: Same name resource file:///usr/lib/spark/python/lib/py4j-0.10.7-src.zip added multiple times to distributed cache
20/06/06 04:05:19 INFO Client: Uploading resource file:/mnt/tmp/spark-0cd5b0e0-9c69-4105-835f-ce1c484787d4/__spark_conf__7110997886244851568.zip -> hdfs://ip-172-31-3-115.us-east-2.compute.internal:8020/user/livy/.sparkStaging/application_1591413438501_0002/__spark_conf__.zip
20/06/06 04:05:20 INFO SecurityManager: Changing view acls to: livy
20/06/06 04:05:20 INFO SecurityManager: Changing modify acls to: livy
20/06/06 04:05:20 INFO SecurityManager: Changing view acls groups to: 
20/06/06 04:05:20 INFO SecurityManager: Changing modify acls groups to: 
20/06/06 04:05:20 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(livy); groups with view permissions: Set(); users  with modify permissions: Set(livy); groups with modify permissions: Set()
20/06/06 04:05:21 INFO Client: Submitting application application_1591413438501_0002 to ResourceManager
20/06/06 04:05:21 INFO YarnClientImpl: Submitted application application_1591413438501_0002
20/06/06 04:05:21 INFO Client: Application report for application_1591413438501_0002 (state: ACCEPTED)
20/06/06 04:05:21 INFO Client: 
     client token: N/A
     diagnostics: [Sat Jun 06 04:05:21 +0000 2020] Application is Activated, waiting for resources to be assigned for AM.  Details : AM Partition = <DEFAULT_PARTITION> ; Partition Resource = <memory:24576, vCores:8> ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; Queue's capacity (absolute resource) = <memory:24576, vCores:8> ; Queue's used capacity (absolute resource) = <memory:0, vCores:0> ; Queue's max capacity (absolute resource) = <memory:24576, vCores:8> ; 
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1591416321309
     final status: UNDEFINED
     tracking URL: http://ip-172-31-3-115.us-east-2.compute.internal:20888/proxy/application_1591413438501_0002/
     user: livy
20/06/06 04:05:21 INFO ShutdownHookManager: Shutdown hook called
20/06/06 04:05:21 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-0cd5b0e0-9c69-4105-835f-ce1c484787d4
20/06/06 04:05:21 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-d83d52f6-d17d-4e29-a562-7013ed539e1a

YARN Diagnostics: 
Application application_1591413438501_0002 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1591413438501_0002_000001 exited with  exitCode: 7
Failing this attempt.Diagnostics: [2020-06-06 04:05:25.619]Exception from container-launch.
Container id: container_1591413438501_0002_01_000001
Exit code: 7
Exception message: Launch container failed
Shell error output: Unable to find image '839713865431.dkr.ecr.us-east-2.amazonaws.com/emr-docker-examples:pyspark-latest' locally
/usr/bin/docker: Error response from daemon: manifest for 839713865431.dkr.ecr.us-east-2.amazonaws.com/emr-docker-examples:pyspark-latest not found: manifest unknown: Requested image not found.
See '/usr/bin/docker run --help'.

Shell output: main : command provided 4
main : run as user is hadoop
main : requested yarn user is livy
Creating script paths...
Creating local dirs...
Getting exit code file...
Changing effective user to root...
Wrote the exit code 7 to /mnt/yarn/nmPrivate/application_1591413438501_0002/container_1591413438501_0002_01_000001/container_1591413438501_0002_01_000001.pid.exitcode


[2020-06-06 04:05:25.645]Container exited with a non-zero exit code 7. Last 4096 bytes of stderr.txt :


[2020-06-06 04:05:25.646]Container exited with a non-zero exit code 7. Last 4096 bytes of stderr.txt :


For more detailed output, check the application tracking page: http://ip-172-31-3-115.us-east-2.compute.internal:8088/cluster/app/application_1591413438501_0002 Then click on links to logs of each attempt.
. Failing the application..

Some things to try:
a) Make sure Spark has enough available resources for Jupyter to create a Spark context.
b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.
c) Restart the kernel.
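
For what it is worth, the "manifest unknown: Requested image not found" line in the shell error output suggests that the pyspark-latest tag simply does not exist in the ECR repository, so YARN cannot pull the container image. A check along these lines should confirm it (a sketch; the repository name and region are taken from the log above):

aws ecr describe-images \
  --repository-name emr-docker-examples \
  --image-ids imageTag=pyspark-latest \
  --region us-east-2

If the tag is missing, the image from the blog post's Dockerfile would need to be built and pushed first:

# Authenticate Docker against ECR, then build and push the image
aws ecr get-login-password --region us-east-2 \
  | docker login --username AWS --password-stdin 839713865431.dkr.ecr.us-east-2.amazonaws.com
docker build -t 839713865431.dkr.ecr.us-east-2.amazonaws.com/emr-docker-examples:pyspark-latest .
docker push 839713865431.dkr.ecr.us-east-2.amazonaws.com/emr-docker-examples:pyspark-latest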

1 Answer

I usually use the following steps to create a cluster:

  1. Create an EMR cluster using AWS Management Console.

  2. Choose emr-5.25.0.

  3. The only application I choose is Spark.

  4. Add the following configuration to apply Python 3 by default:

    [
      {
        "Classification": "spark-env",
        "Configurations": [
          {
            "Classification": "export",
            "Properties": {
               "PYSPARK_PYTHON": "/usr/bin/python3"
            }
          }
        ]
      }
    ]
    
  5. Click Create cluster.

  6. Open a terminal session to SSH into the master node and install jupyterlab:

    sudo pip-3.6 install jupyterlab
    
  7. Start JupyterLab:

    # Use Jupyter as the PySpark driver and serve JupyterLab on all interfaces
    export PYSPARK_DRIVER_PYTHON=$(which jupyter)
    export PYSPARK_DRIVER_PYTHON_OPTS="lab --ip=0.0.0.0"
    
    pyspark --master yarn --driver-memory 8g --executor-memory 20g --executor-cores 4
    
  8. Open a second terminal session to start an SSH tunnel to the master node (an alternative is sketched below):

    ssh -i /path/to/ssh/key.pem -ND 8157 hadoop@master-ip-address
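
As a side note, instead of a SOCKS proxy on port 8157 (which requires configuring the browser to use it), a plain local port forward also works; the 8888 below is JupyterLab's default port and is an assumption, adjust it if yours differs:

    # Forward the remote JupyterLab port to localhost, then browse http://localhost:8888
    ssh -i /path/to/ssh/key.pem -NL 8888:localhost:8888 hadoop@master-ip-address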
    

That's it.


1 Comment

Hey, thanks a lot for the reply. I will keep your process in mind; for the time being, I have stopped working on EMR. Nonetheless, thank you so much for your efforts.
