I have a dataframe like this
+---+-------------------------+-------+
id | sentence | value |
+---+-------------------------+-------+
1 |dogs love fruits | 50|
1 |Mary is in the kitchen | 45|
2 |John eats fruits | 56|
+---+-------------------------+-------+
I have a list of words, and I would like to create multiple columns based on sting matching, the output would look like this
List = ["fruits", "kitchen"]
+---+-------------------------+-------+-----------+------------+------------+-------------+
id | sentence | value |fruits_bool|fruits_value|kitchen_bool|kitchen_value
+---+-------------------------+-------+-----------+------------+------------+-------------+
1 |dogs love fruits | 50|1 |50 |0 |0
1 |Mary is in the kitchen | 45|0 |0 |1 |45
2 |John eats fruits | 56|1 |56 |0 |0
+---+-------------------------+-------+-----------+------------+------------+-------------+
I can easily create these columns word by word, but I would like to do automatically based on the list. Here's the code I tried
groupby("id"). \
pivot('sentence', List). \
agg(
F.count('sentence').alias('bool')
F.max('value').alias('value')
). \
fillna(0)
I get an empty dataframe. What is the best way to do it ?