I have a PySpark DataFrame that looks like this:
| RowNumber | value |
|---|---|
| 1 | [{[email protected], Name=abc}, {[email protected], Name=mnc}] |
| 2 | [{[email protected], Name=klo}, {[email protected], Name=mmm}] |
The column "value" is of string type.
root
|--value: string (nullable = false)
|--rowNumber: integer (nullable = false)
Step 1: I need to explode the list in each row of the "value" column so that every dictionary ends up on its own row.
Step 2: I then need to split each dictionary further so that the resulting table has one column per key.
However, when I try to get to Step 1 using:
df.select(explode(col('value')).alias('value'))
I get this error:
Analysis Exception: cannot resolve 'explode("value")' due to data type mismatch: input to function explode should be array or map type, not string
How do I convert the string in column 'value' to a compatible data type, so that I can explode the dictionary elements as a valid array/JSON (Step 1) and then split them into separate columns (Step 2)?
please help
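
One possible direction, as a rough sketch rather than a definitive solution: since `explode` only accepts array or map columns, the string can first be parsed into an array of maps with a UDF, then exploded, then split by key. The helper names `parse_value` and `explode_value` are hypothetical. The first field of each record is unreadable in the sample above, so the sketch assumes it has the form `Email=...`. It also assumes that values never contain commas or braces.

```python
import re

# Matches one "{...}" record inside the outer list.
_RECORD = re.compile(r"\{([^{}]*)\}")


def parse_value(text):
    """Turn '[{k1=v1, k2=v2}, {...}]' into a list of dicts of strings."""
    if text is None:
        return []
    records = []
    for body in _RECORD.findall(text):
        record = {}
        for pair in body.split(","):
            key, sep, val = pair.partition("=")
            if sep:  # skip fragments that have no '='
                record[key.strip()] = val.strip()
        records.append(record)
    return records


def explode_value(df):
    """Apply Step 1 and Step 2 to a DataFrame with a string column 'value'."""
    # Imported here so parse_value can be tried without a Spark install.
    from pyspark.sql import functions as F
    from pyspark.sql.types import ArrayType, MapType, StringType

    to_maps = F.udf(parse_value, ArrayType(MapType(StringType(), StringType())))
    # Step 1: one row per {...} record, held in a map column.
    step1 = df.withColumn("value", F.explode(to_maps(F.col("value"))))
    # Step 2: one column per key (key names are assumed, see above).
    return step1.select(
        "rowNumber",
        F.col("value")["Email"].alias("Email"),
        F.col("value")["Name"].alias("Name"),
    )


# Hypothetical sample string in the same shape as the "value" column.
sample = "[{Email=abc@example.com, Name=abc}, {Email=mnc@example.com, Name=mnc}]"
print(parse_value(sample))
```

Calling `explode_value(df)` on the DataFrame above should then give one row per dictionary with `Email` and `Name` as separate columns. If the real strings were valid JSON, the built-in `from_json` with an explicit schema would likely be a better fit than a UDF.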

