I don't know if this question is a duplicate, but somehow none of the answers I've come across seem to work for me (maybe I'm doing something wrong).
I have a case class defined as follows:
case class myRec(
  time: String,
  client_title: String,
  made_on_behalf: Double,
  country: String,
  email_address: String,
  phone: String)
and a sample JSON file that contains records (objects) in the form
[{...}, {...}, {...}, ...]
i.e.
[{"time": "2015-05-01 02:25:47",
"client_title": "Mr.",
"made_on_behalf": 0,
"country": "Brussel",
"email_address": "[email protected]"},
{"time": "2015-05-01 04:15:03",
"client_title": "Mr.",
"made_on_behalf": 0,
"country": "Bundesliga",
"email_address": "[email protected]"},
{"time": "2015-05-01 06:29:18",
"client_title": "Mr.",
"made_on_behalf": 0,
"country": "Japan",
"email_address": "[email protected]"}...]
My build.sbt contains:

scalaVersion := "2.11.7"

libraryDependencies += "com.owlike" % "genson-scala_2.11" % "1.3"
I have a Scala function defined as follows:
// PS: other imports already made
import com.owlike.genson.defaultGenson._

// PS: Spark context (sc) already defined
def prepData(infile: String): RDD[myRec] = {
  val input = sc.textFile(infile)
  // Read each line of JSON data into the myRec case class
  input.mapPartitions(records =>
    records.map(record => fromJson[myRec](record))
  )
}
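One variant I sketched (not verified) reads each file whole and lets Genson bind the entire array; I'm assuming here that genson-scala can deserialize to List[myRec]:

def prepDataWhole(infile: String): RDD[myRec] = {
  // wholeTextFiles yields (path, fileContents) pairs, so the
  // complete JSON array arrives as a single string per file
  sc.wholeTextFiles(infile)
    .map(_._2)
    .flatMap(content => fromJson[List[myRec]](content))
}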
and I'm calling the function like this:
prepData("file://path/to/abc.json")
Is there a way to do this, or is there another JSON library I can use to convert the file to an RDD? I also tried this, and neither approach seems to work.
PS: I don't want to go through Spark SQL to process the JSON file.
Thanks!