I need to read a file stored in my project's resources; its path is src/main/resources/dataset/dataset.dat.
I'm using the following Scala code to read the text file and parse it into a Spark RDD of DatasetObject instances:
// init Spark context
val conf: SparkConf = new SparkConf().setAppName("mydataset").setMaster("local")
val sc: SparkContext = new SparkContext(conf)
// read dat file
val resource = this.getClass.getClassLoader.getResource("dataset/dataset.dat")
val dsRdd: RDD[DatasetObject] = sc.textFile(resource.toString(), 1).map(line => DatasetData.parse(line))
but the following error occurred:
class java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: jar:file:/grader/grader.jar!/dataset/dataset.dat
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: jar:file:/grader/grader.jar!/dataset/dataset.dat
I tried to read the file in another way but the error keeps occurring:
val dsRdd: RDD[DatasetObject] = sc.textFile("src/main/resources/dataset/dataset.dat").map(line => DatasetData.parse(line))
Important: the unit tests run successfully on my local machine; the problem occurs in the remote test environment.
src/main does not exist in your JAR or after the code compiles. There is a class called SparkFiles, I believe, which you should be using here.
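For context, sc.textFile expects a path on a filesystem that Hadoop can open, and a jar:file:...!/... URL is not one, which is consistent with the URISyntaxException above. One possible workaround, sketched below, is to read the resource as a stream on the driver and hand the lines to sc.parallelize. This is a different approach from the SparkFiles suggestion, not an implementation of it, and it assumes the file is small enough to fit in driver memory. ResourceRdd, readLines and linesRdd are hypothetical names introduced only for this illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import scala.io.Source

// Hypothetical helper object, not part of Spark or of the original code.
object ResourceRdd {

  // Reads every line of a classpath resource into memory on the driver.
  // Works whether the resource sits in a directory or inside a JAR.
  def readLines(path: String): List[String] = {
    val stream = getClass.getClassLoader.getResourceAsStream(path)
    require(stream != null, s"Resource not found on classpath: $path")
    val source = Source.fromInputStream(stream, "UTF-8")
    try source.getLines().toList
    finally source.close()
  }

  // Distributes the lines as an RDD with a single partition,
  // mirroring the `1` passed to sc.textFile in the question.
  // Assumption: the whole file fits in driver memory.
  def linesRdd(sc: SparkContext, path: String): RDD[String] =
    sc.parallelize(readLines(path), 1)
}
```

With this in place, the failing line could become something like ResourceRdd.linesRdd(sc, "dataset/dataset.dat").map(line => DatasetData.parse(line)). Note that the path is relative to the classpath root, so the src/main/resources prefix is dropped.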