I am working with a Spark DataFrame.
scala> val df_input = Seq( ("p1", """{"a": 1, "b": 2}"""), ("p2", """{"c": 3}""") ).toDF("p_id", "p_meta")
df_input: org.apache.spark.sql.DataFrame = [p_id: string, p_meta: string]
scala> df_input.show()
+----+----------------+
|p_id|          p_meta|
+----+----------------+
|  p1|{"a": 1, "b": 2}|
|  p2|        {"c": 3}|
+----+----------------+
Given this input DataFrame, is it possible to split each row by JSON key, producing one row per key/value pair, to create a new df_output like the one below?
df_output =
p_id  p_meta_key  p_meta_value
p1    a           1
p1    b           2
p2    c           3
I am using Spark 3.0.0 with Scala 2.12.x, and I would prefer a solution based on spark.sql.functions._.
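I was thinking along these lines (an untested sketch, which assumes every JSON value is an integer so the map can be typed as MapType(StringType, IntegerType)): parse p_meta with from_json into a map column, then explode the map into key/value rows.

```scala
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.{IntegerType, MapType, StringType}

// Parse the JSON string into a map column, then explode the map
// into one (key, value) row per entry. If the values are not all
// integers, StringType could be used as the value type instead.
val df_output = df_input
  .withColumn("p_meta_map", from_json(col("p_meta"), MapType(StringType, IntegerType)))
  .select(col("p_id"), explode(col("p_meta_map")).as(Seq("p_meta_key", "p_meta_value")))
```

Is this the idiomatic way to do it, or is there a better approach with the built-in functions?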