15

Hello, I was working with PySpark, implementing a sentiment analysis project using the ML package for the first time. The code was working fine, but suddenly it started showing the error below:

   ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:50532)
Traceback (most recent call last):
  File "C:\opt\spark\spark-2.3.0-bin-hadoop2.7\python\lib\py4j-0.10.6-src.zip\py4j\java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\opt\spark\spark-2.3.0-bin-hadoop2.7\python\lib\py4j-0.10.6-src.zip\py4j\java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

Can someone help, please? The full error description is above.

2
  • I get this error when trying to initialize SparkContext from the shell. SparkContext is created automatically in the shell. Commented Aug 7, 2018 at 12:27
  • In my case I am working in a Jupyter notebook, so I have to initialize the SparkContext manually (see the sketch below). Commented Aug 10, 2018 at 14:06
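For reference, a minimal sketch of that manual initialization in Spark 2.x, where building a SparkSession also gives you the SparkContext (the app name is just a placeholder):

from pyspark.sql import SparkSession

# In the pyspark shell, `spark` and `sc` are created for you;
# in Jupyter you build them yourself.
spark = SparkSession.builder \
    .appName('notebook_session') \
    .master('local[*]') \
    .getOrCreate()
sc = spark.sparkContext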

5 Answers

13

Just restart your notebook if you are using Jupyter. If not, restart PySpark; that should solve the problem. It usually happens because of too many collect() calls or some other memory-related issue.
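If restarting the kernel is inconvenient, a hedged alternative is to tear down and rebuild the session from within the notebook; this sketch assumes a SparkSession named spark already exists:

from pyspark.sql import SparkSession

# Stop the broken session; this may itself fail if the JVM is already gone.
try:
    spark.stop()
except Exception:
    pass

# Build a fresh session and continue working.
spark = SparkSession.builder \
    .appName('sentiment_analysis') \
    .master('local[*]') \
    .getOrCreate()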



9

Add more resources to Spark. For example, if you're working in local mode, a configuration like the following should be sufficient:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName('app_name') \
    .master('local[*]') \
    .config('spark.sql.execution.arrow.pyspark.enabled', True) \
    .config('spark.sql.session.timeZone', 'UTC') \
    .config('spark.driver.memory', '32G') \
    .config('spark.ui.showConsoleProgress', True) \
    .config('spark.sql.repl.eagerEval.enabled', True) \
    .getOrCreate()
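One caveat, hedged: spark.driver.memory is only applied when the JVM starts, so setting it in the builder has no effect if a context already exists. In a notebook it can instead be passed through the PYSPARK_SUBMIT_ARGS environment variable before the first session is created (the 8G value is only an example; size it to your machine):

import os

# Must run before any SparkSession/SparkContext is created; the trailing
# 'pyspark-shell' token is required by PySpark's launcher.
os.environ['PYSPARK_SUBMIT_ARGS'] = '--driver-memory 8G pyspark-shell'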


7

I encountered this error while trying to use PySpark within a Docker container. In my case, it came from assigning more resources to Spark than Docker had access to.
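As a rough sketch of that fix, assuming the container is capped at around 4 GB (both figures below are illustrative, not recommendations):

from pyspark.sql import SparkSession

# Size the driver below the container's memory limit, leaving headroom
# for the Python process itself and off-heap JVM allocations.
spark = SparkSession.builder \
    .appName('containerized_app') \
    .master('local[*]') \
    .config('spark.driver.memory', '2g') \
    .getOrCreate()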

2 Comments

How do I solve this issue if I'm running it in production? Do I need to flush the RAM and restart the application, or something else?
If you have been able to run the application successfully once, restarting it may help. In my case, I just couldn't get it to run even once; I eventually ended up reducing the Spark driver memory to what could safely fit within the container.
0

I encountered the same problem while working on Colab. I terminated the current session and reconnected, and it worked for me!


0

Maybe the Spark UI port is already occupied, or maybe there are other errors before this one.

Maybe this can help you: https://stackoverflow.com/questions/32820087/spark-multiple-spark-submit-in-parallel

spark-submit --conf spark.ui.port=5051
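From a notebook, the same setting can be passed through the builder; the port number below just mirrors the spark-submit example, and any free port works:

from pyspark.sql import SparkSession

# Bind the Spark UI to an explicit free port instead of the default 4040.
spark = SparkSession.builder \
    .config('spark.ui.port', '5051') \
    .getOrCreate()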

