I am new to Scala and Spark. I am trying to write a simple program that removes rows containing empty/null values (without using DataFrame). I tried to do it with filter, but it's not working. Can you please tell me where I am making the mistake?
Data:
Bypass Road (film),2019,137,Drama|Thriller,7.1,51
Satellite Shankar,2019,135,Action|Drama,4.6,34
Jhalki,2019,0,Drama,,
Marjaavaan,2019,0,Action|Romance,,
Motichoor Chaknachoor,2019,150,Comedy|Romance,,
Keep Safe Distance (film),2019,0,Action|Thriller,,
I am trying to remove rows with empty values, such as: Keep Safe Distance (film),2019,0,Action|Thriller,,
Code:
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder()
  .appName("MovieAnalyzer")
  .master("local")
  .getOrCreate()
val dataRDD = sparkSession.sparkContext.textFile("src/test/resources/movies.csv")
// remove the header
val header = dataRDD.first()
// attempt to drop rows with empty values, then replace \N with 0
val movie_list = dataRDD.filter(line => line != header)
  .filter(r => !r.contains(""))
  .map(r => r.replace("\\N", "0"))
movie_list.collect().foreach(println)
The above code does not print any data from the CSV file. Please let me know what the problem with my code is.
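One likely cause is the predicate `!r.contains("")`: every string contains the empty string, so `r.contains("")` is always true and the filter rejects every row. The sketch below shows one possible alternative that checks each field instead. It uses plain Scala collections rather than an RDD so it runs on its own; the object name `EmptyFieldFilter` and the helper `hasNoEmptyField` are hypothetical names introduced for illustration, and the simple comma split assumes no field contains a quoted comma.

```scala
object EmptyFieldFilter {
  // Hypothetical helper: true when every comma-separated field is non-blank.
  // The -1 limit keeps trailing empty fields, which split(",") alone would drop.
  def hasNoEmptyField(line: String): Boolean =
    !line.split(",", -1).exists(_.trim.isEmpty)

  def main(args: Array[String]): Unit = {
    // Sample rows copied from the data in the question
    val lines = Seq(
      "Bypass Road (film),2019,137,Drama|Thriller,7.1,51",
      "Satellite Shankar,2019,135,Action|Drama,4.6,34",
      "Jhalki,2019,0,Drama,,",
      "Keep Safe Distance (film),2019,0,Action|Thriller,,"
    )

    // Every string contains "", so the original predicate matches all rows
    println(lines.count(_.contains(""))) // prints 4

    // Keep only the rows in which no field is empty
    lines.filter(hasNoEmptyField).foreach(println)
  }
}
```

The same function should work as an RDD predicate, for example `dataRDD.filter(line => line != header).filter(EmptyFieldFilter.hasNoEmptyField)`, since `RDD.filter` takes a `String => Boolean` function here.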