1

I have a column in dataframe as list of objects(Array of structs) like

column: [{key1:value1}, {key2:value2}, {key3:value3}]

I want to split this column into separate columns with key name as column name and value as column value in same row.
Final result like

key1:value1, key2:value2, key3:value3

How to achieve this in pyspark ?

E.g.

Sample data to create dataframe:

my_new_schema = StructType([
    StructField('id', LongType()),
    StructField('countries', ArrayType(StructType([
        StructField('name', StringType()),
        StructField('capital', StringType())
    ])))
])
l = [(1, [
        {'name': 'Italy', 'capital': 'Rome'},
        {'name': 'Spain', 'capital': 'Madrid'}
    ])
]
    
dz = spark.createDataFrame(l, schema=my_new_schema)
# we have array of structs:
dz.show(truncate=False)
+---+--------------------------------+
|id |countries                       |
+---+--------------------------------+
|1  |[{Italy, Rome}, {Spain, Madrid}]|
+---+--------------------------------+

Expected output:

+---+--------+---------+
|id |Italy   |  Spain  |
+---+------------------+
|1  |Rome    | Madrid  |
+---+--------+---------+
0

1 Answer 1

1

inline the countries array then pivot the country name column:

import pyspark.sql.functions as F

dz1 = dz.selectExpr(
    "id", 
    "inline(countries)"
).groupBy("id").pivot("name").agg(
    F.first("capital")
)

dz1.show()
#+---+-----+------+
#|id |Italy|Spain |
#+---+-----+------+
#|1  |Rome |Madrid|
#+---+-----+------+
Sign up to request clarification or add additional context in comments.

2 Comments

is it possible to do pivot on condition.. let's say if the struct has another property continent:Europe, i want to pivot the country name and capital as earlier but only if the continent value is Europe.. is this possible to do
@Manojkumar simply add a filter before grouping: dz1 = dz.selectExpr("id", "inline(countries)").filter("continent = 'Europe'")...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.