My dataset has multiple columns which include lists.
A B First_val Second_value Colour
0 aaa ['hello', 'world'] ['bla','bla'] ['blue'] Red
1 sad ['aaa'] ['cat'] ['hi'] Green
2 vas ['min','max'] ['avg'] [] Yellow
...
The colour should be assigned only based on A value: this means that, if I have multiple rows because of exploding lists, I should have red all the times, then green, then yellow for all the rows having in A aaa, sad and vas respectively.
I have tried with
(df.set_index('A')
.apply(lambda x: x.apply(pd.Series).stack())
.reset_index()
.drop('level_1', 1))
but the value of Colour is not assigned per all A's values, but only for the first one (the others are NaN). Example (wrong):
A B First_val Second_value Colour
0 aaa hello bla blue Red
1 aaa world bla nan nan
...
Expected output:
A B First_val Second_value Colour
0 aaa hello bla blue Red
1 aaa world bla nan Red
...
How can I get it?