I have one file as follows:
dept.txt:
1,It,pune,2017-03-12
2,CS,delhi,2017-03-21
3,mech,mum,
4,fin,pune,2017-04-15
5,It,delhi,
What I need to do is:
Read data from 2 files into 2 RDDs (this I have done)
Apply a filter on the date column in the dept file and get two output files, one for rows where the date is null and one for rows where it is not null (this I am unable to do)
How far I have been able to proceed:
val loadDept = sc.textFile("/path/to/file/dept.txt")
val cleanDept = loadDept.map(_.split(","))
val dateCol = cleanDept.filter(i => i(3) != "")
The error occurs on the last line:
java.lang.ArrayIndexOutOfBoundsException: 3
I understand that since there is an empty string/null I am getting an out-of-bounds exception (please correct me if I am wrong), but how do I get around it?
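For what it is worth, the exception most likely comes from how `split` treats trailing empty fields rather than from a null value: with a single argument, `String.split` discards trailing empty strings, so a row ending in a comma yields only 3 elements and index 3 does not exist. A minimal plain-Scala sketch (no Spark needed) comparing the two variants on one row from the sample data:

```scala
// Compare the two split variants on a row whose last field (the date) is empty.
val row = "3,mech,mum,"

val dropped = row.split(",")      // trailing empty string is discarded
val kept    = row.split(",", -1)  // a negative limit keeps trailing empty strings

println(dropped.length)  // 3, so dropped(3) is out of bounds
println(kept.length)     // 4, and kept(3) is ""
```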
Note: I need to use only RDDs in Scala.
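One possible RDD-only approach, sketched under a few assumptions: the code runs in a spark-shell session where `sc` already exists, a missing date always shows up as an empty last field, and the two output directories are hypothetical names chosen for illustration. Calling `split(",", -1)` keeps the trailing empty field, so each row has four columns and `i(3)` is safe to read; the length check is only an extra guard against blank or malformed lines.

```scala
// Assumes a spark-shell session, where `sc` (the SparkContext) already exists.
val loadDept = sc.textFile("/path/to/file/dept.txt")

// limit -1 keeps the trailing empty date field, so every sample row has 4 columns
val cleanDept = loadDept.map(_.split(",", -1))

// Guard on length as well, in case a line is blank or malformed
val withDate    = cleanDept.filter(f => f.length > 3 && f(3).trim.nonEmpty)
val withoutDate = cleanDept.filter(f => f.length <= 3 || f(3).trim.isEmpty)

// Hypothetical output directories; saveAsTextFile fails if they already exist
withDate.map(_.mkString(",")).saveAsTextFile("/path/to/output/dept_with_date")
withoutDate.map(_.mkString(",")).saveAsTextFile("/path/to/output/dept_without_date")
```

Each `saveAsTextFile` call writes a directory of part files rather than a single file, so "two output files" here means two output directories.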