I am trying to get the complex data into normal dataframe format
My data schema:
root
|-- column_names: array (nullable = true)
| |-- element: string (containsNull = true)
|-- values: array (nullable = true)
| |-- element: array (containsNull = true)
| | |-- element: string (containsNull = true)
|-- id: array (nullable = true)
| |-- element: string (containsNull = true)
My Data File(JSON Format):
{"column_names":["2_col_name","3_col_name"],"id":["a","b","c","d","e"],"values":[["2_col_1",1],["2_col_2",2],["2_col_3",9],["2_col_4",10],["2_col_5",11]]}
I am trying to convert above data into this format:
+----------+----------+----------+
|1_col_name|2_col_name|3_col_name|
+----------+----------+----------+
| a| 2_col_1| 1|
| b| 2_col_2| 2|
| c| 2_col_3| 9|
| d| 2_col_4| 10|
| e| 2_col_5| 11|
+----------+----------+----------+
I tried using explode function on id and values but got different output as below:
+---+-------------+
| id| values|
+---+-------------+
| a| [2_col_1, 1]|
| a| [2_col_2, 2]|
| a| [2_col_3, 9]|
| a|[2_col_4, 10]|
| a|[2_col_5, 11]|
| b| [2_col_1, 1]|
| b| [2_col_2, 2]|
| b| [2_col_3, 9]|
| b|[2_col_4, 10]|
+---+-------------+
only showing top 9 rows
Not sure where i am doing wrong