I am trying to process an XML file using Scala and Spark.
I have this schema:
root
|-- IdKey: long (nullable = true)
|-- Value: string (nullable = true)
|-- CDate: date (nullable = true)
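For reference, a schema of this shape can be declared in Scala roughly as follows. This is only a sketch reconstructed from the printed schema above: it assumes Spark SQL is on the classpath, and the name `schema` is chosen to match the variable passed to `.schema(...)` in the read call.

```scala
import org.apache.spark.sql.types.{DateType, LongType, StringType, StructField, StructType}

// Flat schema matching the printSchema output above; all fields are nullable.
val schema = StructType(Seq(
  StructField("IdKey", LongType, nullable = true),
  StructField("Value", StringType, nullable = true),
  StructField("CDate", DateType, nullable = true)
))
```

Calling `schema.printTreeString()` on this value should print the same tree as shown above.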
And I want to process this XML file:
<Item>
  <CDate>2018-05-08T00:00::00</CDate>
  <ListItemData>
    <ItemData>
      <IdKey>2</IdKey>
      <Value>1</Value>
    </ItemData>
    <ItemData>
      <IdKey>61</IdKey>
      <Value>2</Value>
    </ItemData>
  </ListItemData>
</Item>
I am using this code:
sqlContext.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "Item")
  .schema(schema)
  .load(xmlFile)
But in my result the CDate column is always null:
+-----+-----+-----+
|IdKey|Value|CDate|
+-----+-----+-----+
|61   |1    |null |
|2    |2    |null |
+-----+-----+-----+
Is it possible to parse the XML file with this schema? I want to obtain these values:
+-----+-----+--------------------+
|IdKey|Value|CDate               |
+-----+-----+--------------------+
|61   |1    |2018-05-08T00:00::00|
|2    |2    |2018-05-08T00:00::00|
+-----+-----+--------------------+
Thanks