5

I have installed Spark 2.2 with winutils on Windows 10. When I try to run pyspark I get the exception below:

pyspark.sql.utils.IllegalArgumentException: "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder'

I have already tried setting 777 permissions on the tmp/hive folder as well, but it has not helped:

winutils.exe chmod -R 777 C:\tmp\hive

After applying this, the problem remains the same. I am using pyspark 2.2 on Windows 10. Here is the spark-shell environment: [screenshot]

Here is the pyspark shell: [screenshot]

Kindly help me figure this out. Thank you.

  • Thanks Jacek for your reply. I was trying your instructions in my configuration, and I got it installed successfully on my home computer. OK, here it is: Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.net.ConnectException: Call From DESKTOP-SDNSD47/192.168.10.143 to 0.0.0.0:9000 failed on connection exception: java.net.ConnectException: Connection refused. I am getting this as well. Kindly help me. Commented Jul 19, 2017 at 11:09
  • Thanks a lot Jacek. There was an HDFS conf folder path that I had created in my user variables for an earlier experiment. I deleted that and pyspark is working :) Thanks a lot and sorry for disturbing you. I am learning Spark, and today I learnt from you how to fix a weird exception like this. Thanks a lot. Commented Jul 20, 2017 at 8:35
  • We all learn here. I used our conversation to answer your question for future reference. Please accept if it matches what helped you solve the issue. Thanks. Commented Jul 20, 2017 at 12:09
  • Possible duplicate of Spark 2.1 - Error While instantiating HiveSessionState Commented Feb 14, 2018 at 4:22

8 Answers

4

I had the same problem using the command 'pyspark' as well as 'spark-shell' (for Scala) on macOS with Apache Spark 2.2. Based on some research I figured it was because of my JDK version 9.0.1, which does not work well with Apache Spark. Both errors were resolved by switching back from JDK 9 to JDK 8.

Maybe that will help with your Windows Spark installation too.
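If JDK 9 is what is currently on your PATH, a rough sketch of how to point a Windows command prompt session at a JDK 8 install instead (the path below is only an example; adjust it to wherever JDK 8 is installed on your machine):

    REM Show which Java version is currently picked up from PATH
    java -version

    REM Point JAVA_HOME at a JDK 8 installation for this session (example path, adjust to yours)
    set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_151"
    set "PATH=%JAVA_HOME%\bin;%PATH%"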



1

Port 9000?! It must be something Hadoop-related, as I don't remember that port being used by Spark. I'd recommend using spark-shell first, since that eliminates any additional "hops": spark-shell does not require two runtimes, for Spark itself and for Python.

Given the exception, I'm pretty sure the issue is that you've got some Hive- or Hadoop-related configuration lying around somewhere, and Spark apparently uses it.

The "Caused by" part seems to show that port 9000 is used when Spark SQL is created, which is when the Hive-aware subsystem is loaded.

Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.net.ConnectException: Call From DESKTOP-SDNSD47/192.168.10.143 to 0.0.0.0:9000 failed on connection exception: java.net.ConnectException: Connection refused

Please review the environment variables on Windows 10 (possibly using the set command on the command line) and remove anything Hadoop-related.
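For example, a quick way to scan for leftovers from a Windows command prompt (the variable names in the comment are just the usual suspects, not an exhaustive list):

    REM List any environment variables that mention Hadoop, Hive or HDFS
    set | findstr /i "hadoop hive hdfs"

    REM Typical leftovers to remove if present: HADOOP_HOME, HADOOP_CONF_DIR, HIVE_HOME, HIVE_CONF_DIR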


1

Posting this answer for posterity. I faced the same error. The way I solved it was by first trying out spark-shell instead of pyspark; the error message was more direct.

This gave a better idea: there was an S3 access error. Next, I checked the EC2 role/instance profile for that instance; it had S3 administrator access.

Then I grepped for s3:// in all the conf files under the /etc/ directory, and found that core-site.xml contains this property:

    <property>
        <!-- URI of NN. Fully qualified. No IP.-->
        <name>fs.defaultFS</name>
        <value>s3://arvind-glue-temp/</value>
    </property>

Then I remembered: I had removed HDFS as the default file system and set it to S3. I had created the EC2 instance from an earlier AMI and had forgotten to update the S3 bucket to the one corresponding to the newer account.

Once I updated the S3 bucket to one accessible by the current EC2 instance profile, it worked.
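If you want to repeat that check on a similar setup, a rough sketch (the core-site.xml path is an assumption; it depends on your distribution):

    # Find any config file under /etc/ that still hard-codes an s3:// location
    grep -rn "s3://" /etc/ 2>/dev/null

    # Inspect the fs.defaultFS value directly (example path; adjust to your distribution)
    grep -n -A 1 "fs.defaultFS" /etc/hadoop/conf/core-site.xml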


0

To use Spark on Windows OS, you may follow this guide.

NOTE: Ensure that your IP address resolves correctly against your hostname as well as localhost; a missing localhost resolution has caused problems for us in the past.
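A quick sanity check from a Windows command prompt might look like this (a sketch, assuming the default location of the hosts file):

    REM Confirm that both the hostname and localhost resolve
    hostname
    ping -n 1 localhost
    ping -n 1 %COMPUTERNAME%

    REM Inspect the hosts file for missing or stale entries
    type C:\Windows\System32\drivers\etc\hosts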

Also, you should provide the full stack trace as it helps to debug the issue quickly and saves the guesswork.

Let me know if this helps. Cheers.


0

Try this, it worked for me! Open a command prompt in administrator mode and then run the command 'pyspark'. This should open a Spark session without errors.


0

I also came across this error on Ubuntu 16.04:

raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder'

This happened because I had already run ./bin/spark-shell.

So just kill that spark-shell and re-run ./bin/pyspark.
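One way to find and stop the leftover session (a sketch, assuming jps from the JDK is on your PATH; replace <pid> with whatever number it prints):

    # The old spark-shell shows up as a SparkSubmit JVM
    jps | grep SparkSubmit

    # Kill it by PID, then start pyspark again
    kill <pid>
    ./bin/pyspark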


0

I also came across this error on macOS 10, and I solved it by using Java 8 instead of Java 9.

When Java 9 is the default version resolved in the environment, pyspark throws this error, and you will see a name 'xx' is not defined error when trying to access sc, spark, etc. from the shell / Jupyter.
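On macOS, a minimal way to switch a shell session back to Java 8 (assuming a JDK 8 is still installed alongside JDK 9) is via java_home:

    # List the installed JDKs
    /usr/libexec/java_home -V

    # Point JAVA_HOME at the 1.8 JDK for this session, then relaunch pyspark
    export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
    pyspark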

For more details, you can see this link.


0

You must have a hive-site.xml file in the Spark configuration directory. Changing the port from 9000 to 9083 resolved the problem for me.

Please ensure that the property is updated in both hive-site.xml files, i.e. the ones placed under the Hive config and Spark config directories.

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>

For me, on Ubuntu, the locations of hive-site.xml are:

/home/hadoop/hive/conf/

and

/home/hadoop/spark/conf/
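Before restarting Spark, it can help to confirm that a Hive metastore is actually listening on the Thrift port you configured (a sketch; how the metastore service is started depends on your setup):

    # Start the Hive metastore service if it is not already running
    hive --service metastore &

    # Verify that something is listening on the port referenced in hive-site.xml
    netstat -an | grep 9083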

