
I am attempting to connect to a standalone Spark server from a Java application using the following code:

SparkConf sparkConf_new = new SparkConf()
    .setAppName("Example Spark App")
    .setMaster("spark://my.server.com:7077");
JavaSparkContext sparkContext = new JavaSparkContext(sparkConf_new);
JavaRDD<String> stringJavaRDD = sparkContext.textFile("hdfs://cluster/my/path/test.csv");
out.println("Number of lines in file = " + stringJavaRDD.count());

I am receiving the following error:

An exception occurred at line 12

12: SparkConf sparkConf_new = new SparkConf()
13:     .setAppName("Example Spark App")
14:     .setMaster("spark://my.server.com:7077");
15: JavaSparkContext sparkContext = new JavaSparkContext(sparkConf_new);
16: JavaRDD<String> stringJavaRDD = sparkContext.textFile("hdfs://cluster/my/path/test.csv");
17: out.println("Number of lines in file = " + stringJavaRDD.count());

java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.util.Utils$
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:59)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:53)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:123)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:54)

The following JARs are included:

scala-library-2.10.5.jar
spark-core_2.10-1.6.0.jar
hadoop-core-1.2.1.jar
5 Comments
  • Which version of Spark and Hadoop are you using? Commented Nov 23, 2016 at 13:59
  • You're missing the original cause of your exception, which should be a preceding ExceptionInInitializerError for the Utils$ class. Commented Nov 23, 2016 at 14:02
  • That is the reason I'm here; nowhere in my logs does that exist, which is why I've asked. Normally I would see it. @Nirmal Ram: the backend is Spark 1.6.0 and Hadoop 2.7. Commented Nov 23, 2016 at 14:22
  • Just noticed the hadoop-core version; updated it to 1.2.1, same issue though. Commented Nov 23, 2016 at 14:34
  • On the server you are connecting to, can you open the spark-shell successfully? Commented Nov 23, 2016 at 19:44

2 Answers

1

You typically package your application into an uber JAR file and use the $SPARK_HOME/bin/spark-submit script to send it to the server for execution.

Try creating the simplest possible application to start with. Using Maven, all you should need in your project dependencies is:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <!-- use the release that matches your cluster; 1.6.0 is assumed from the question -->
    <version>1.6.0</version>
</dependency>

Doing it this way, all of your environment configuration (server URL, etc.) can be defined outside of your Java code in a script, making it more portable.
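
As an illustration, here is a minimal sketch of the same logic with the master URL left out of the code (the class name and submit command shown in the comment are hypothetical; the master is supplied to spark-submit instead):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ExampleSparkApp {
    public static void main(String[] args) {
        // No setMaster() here: the master URL is passed at submission time, e.g.
        //   $SPARK_HOME/bin/spark-submit --master spark://my.server.com:7077 \
        //       --class ExampleSparkApp example-spark-app-uber.jar
        SparkConf sparkConf = new SparkConf().setAppName("Example Spark App");
        JavaSparkContext sparkContext = new JavaSparkContext(sparkConf);

        // Count the lines of the HDFS file from the question
        JavaRDD<String> lines = sparkContext.textFile("hdfs://cluster/my/path/test.csv");
        System.out.println("Number of lines in file = " + lines.count());

        sparkContext.stop();
    }
}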


0

If you write an application in Spark, then even if you are sending it to a remote cluster, those three JARs are not enough. You should add all of Spark's dependencies to the application's classpath.

The easiest way is to use Maven or Gradle (see http://spark.apache.org/docs/1.6.3/programming-guide.html#linking-with-spark), which will include Spark and all of its transitive dependencies. If you cannot use a build system, download an official Spark build and add all the JARs in its jars/ directory to the classpath of your application.
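
For reference, a minimal pom.xml dependency block matching the versions mentioned in the question (Spark 1.6.0 built for Scala 2.10 on the client, Hadoop 2.7 on the cluster; treat the exact version numbers as assumptions) might look like this:

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.0</version>
    </dependency>
    <!-- pulls in the Hadoop client libraries (including hadoop-common) for HDFS access -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.7.1</version>
    </dependency>
</dependencies>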

1 Comment

Plus, hadoop-core should be replaced with the hadoop-common JAR.
