I have a JSON file and I want to do some ETL on it. I want to extract a nested column and append its values as new rows in the DataFrame. For example, if I have a DataFrame like this:
----------------------------------------------------------------
| name   | last      | father                                  |
----------------------------------------------------------------
| daniel | allardice | {'name': 'george', 'last': 'allardice'} |
----------------------------------------------------------------
I want to turn it into:
----------------------
| name   | last      |
----------------------
| daniel | allardice |
----------------------
| george | allardice |
----------------------
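To pin down the intended result, here is a minimal sketch of the row logic in plain Python (not PySpark). It assumes each record is a dict and that `father` is a nested dict with the same `name` and `last` keys; the helper name `flatten_father` is made up for illustration.

```python
def flatten_father(row):
    """Return the person's own row, followed by a row built from `father` if present."""
    # Keep only the top-level name/last fields for the person.
    rows = [{"name": row["name"], "last": row["last"]}]
    # Assumption: `father` is either missing/None or a dict with `name` and `last`.
    father = row.get("father")
    if father:
        rows.append({"name": father["name"], "last": father["last"]})
    return rows


records = [
    {
        "name": "daniel",
        "last": "allardice",
        "father": {"name": "george", "last": "allardice"},
    }
]

# One input record can produce one or two output rows.
flattened = [out for rec in records for out in flatten_father(rec)]
print(flattened)
```

This only shows the shape of the output I am after; it does not use Spark at all.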
How can I do this with a UDF in PySpark?