
I am using spark-1.5.0-cdh5.6.0. I tried the sample application (Scala); the command is:

> spark-submit --class com.cloudera.spark.simbox.sparksimbox.WordCount --master local /home/hadoop/work/testspark.jar

Got the following error:

 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/user/spark/applicationHistory does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:424)
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:100)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
        at com.cloudera.spark.simbox.sparksimbox.WordCount$.main(WordCount.scala:12)
        at com.cloudera.spark.simbox.sparksimbox.WordCount.main(WordCount.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

2 Answers


Spark has a feature called the "history server" which allows you to browse historical events after the SparkContext dies. Event logging for it is enabled by setting spark.eventLog.enabled to true.

You have two options: either specify a valid directory to store the event log via the spark.eventLog.dir config value, or simply set spark.eventLog.enabled to false if you don't need it.
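As a minimal sketch of both options applied to the command from the question (the applicationHistory path is taken from the error above; whatever directory you use must actually exist on the filesystem Spark resolves it against):

    # Option 1: keep event logging, but point it at a directory that exists
    mkdir -p /user/spark/applicationHistory
    spark-submit --class com.cloudera.spark.simbox.sparksimbox.WordCount \
      --master local \
      --conf spark.eventLog.enabled=true \
      --conf spark.eventLog.dir=file:///user/spark/applicationHistory \
      /home/hadoop/work/testspark.jar

    # Option 2: disable event logging if you don't need the history server
    spark-submit --class com.cloudera.spark.simbox.sparksimbox.WordCount \
      --master local \
      --conf spark.eventLog.enabled=false \
      /home/hadoop/work/testspark.jar

The same properties can also be set persistently in spark-defaults.conf instead of being passed on the command line.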

You can read more on that in the Spark Configuration page.



I got the same error while working with NLTK in Spark. To fix it, I just removed all the NLTK-related properties from spark-defaults.conf.
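Whatever the specific properties were, the error itself comes from the event-log settings; as an illustrative sketch, the kind of spark-defaults.conf lines that point the event log at a missing directory look roughly like this (paths and values are examples, not taken from the answer above):

    # /etc/spark/conf/spark-defaults.conf (illustrative excerpt)
    # Remove or comment out entries that reference a directory that does not exist:
    #spark.eventLog.enabled    true
    #spark.eventLog.dir        hdfs:///user/spark/applicationHistory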

