I am new to Spark, Hadoop, and Scala. I need to read a file from a local directory in Scala/Spark, and I am running into issues. I see that others have hit the same problem, but I have not found a solution.
I am using Spark 1.6.2
My code reads like this:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object TestMyApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MyAppName").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val resultDf = sqlContext.read.json("/opt/app/poc/myfile.json")
  }
}
I am getting the following error: Exception in thread "main" java.io.IOException: No input paths specified in job
Note: my application is installed in /opt/app/spark and I launch it with /usr/bin/spark-submit --class com.mycom.TestMyApp /opt/app/spark/App.jar. I cannot bundle the JSON file inside the project jar; the requirement is to read it from a local directory.
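To rule out a missing or unreadable file, here is a quick check that could go in the same main method before the Spark read (a minimal sketch; the path is the same one used above):

import java.io.File

val f = new File("/opt/app/poc/myfile.json")
// Should print exists=true readable=true if the driver process can see the file
println(s"exists=${f.exists} readable=${f.canRead}")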
I am not able to figure out where I am going wrong. Please help.
Here is part of the stacktrace:
Exception in thread "main" java.io.IOException: No input paths specified in job
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:202)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:242)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:240)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    ...
Update: one suggestion I received was to add file:// at the start of the URL.
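If that is the fix, the read would look like this (a minimal sketch, assuming the sqlContext from the code above; the explicit file:// scheme tells Hadoop to use the local filesystem rather than the configured default filesystem):

// file:// scheme followed by the absolute path /opt/app/poc/myfile.json
val resultDf = sqlContext.read.json("file:///opt/app/poc/myfile.json")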