I have some incoming data in rowValues and need to apply a particular schema to it to create a DataFrame. Here is my code:
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
// sqlContext is the SQLContext already available in the session (e.g. spark-shell)
val rowValues = List("12", "F", "1980-10-11,1980-10-11T10:10:20")
val rdd = sqlContext.sparkContext.parallelize(Seq(rowValues))
val rowRdd = rdd.map(v => Row(v: _*)) // wrap each list of values in a Row
// Target schema: three nullable columns (metadata left at its default instead of null)
val schema = StructType(Seq(
  StructField("C0", IntegerType, nullable = true),
  StructField("C1", StringType, nullable = true),
  StructField("C2", TimestampType, nullable = true)
))
val newRow = sqlContext.createDataFrame(rowRdd, schema)
newRow.printSchema() // the new schema prints here
newRow.show()        // this fails with a ClassCastException
The call to newRow.show() fails with the following exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 16.0 failed 1 times, most recent failure: Lost task 0.0 in stage 16.0 (TID 16, localhost): java.lang.ClassCastException: java.lang.String cannot be cast to java.sql.Timestamp
How do I apply this schema to rowValues so that each column gets its declared type?