I have a pandas DataFrame that looks like this:

genrename  subgenre  subgenrename  actor
Songs      208       Dance         Aamir Khan
Songs      208       Dance         Aamir Khan
Songs      211       Romantic      Aamir Khan
Movies     1         Romantic      Aamir Khan
Songs      208       Dance         Aamir Khan
Clips      15        Scenes        Aamir Khan
Clips      15        Scenes        Aamir Khan,Salman
Clips      12        Romantic      Salman

The output DataFrame I am trying to get would look something like this:

Actor_Name  songs  clips  movies
Aamir Khan  4      2      1
Salman      0      2      0

Can somebody guide me on how to do this with pandas or any other Python data processing library?

Thanks

1 Answer

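First split the actor column on commas with str.split, stack the result so that each actor gets its own row, and join it back to the original DataFrame. Then use pivot_table with aggfunc=len to count rows per actor and genre, followed by reset_index and rename_axis (new in pandas 0.18.0) to flatten the result: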

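# Split comma-separated actors into columns, then stack into one row per actor
s = df.actor.str.split(',', expand=True).stack()
# Drop the inner level added by stack so the index lines up with df again
s.index = s.index.droplevel(-1)
s.name = 'actor1'
df = df.join(s)

# Count rows per actor and genre; combinations that never occur become 0
print(df.pivot_table(index='actor1',
                     columns='genrename',
                     aggfunc=len,
                     values='subgenre',
                     fill_value=0).reset_index().rename_axis(None, axis=1))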

       actor1  Clips  Movies  Songs
0  Aamir Khan      2       1      4
1      Salman      2       0      0
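
For anyone on a newer pandas, here is a self-contained sketch of the same idea using a different technique from the answer above: DataFrame.explode (available since pandas 0.25) instead of stack/join, and pd.crosstab instead of pivot_table. The sample data is copied from the question; the variable names exploded, counts and result are only illustrative, and the strip step assumes that names could be followed by stray spaces after the comma.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "genrename": ["Songs", "Songs", "Songs", "Movies",
                  "Songs", "Clips", "Clips", "Clips"],
    "subgenre": [208, 208, 211, 1, 208, 15, 15, 12],
    "subgenrename": ["Dance", "Dance", "Romantic", "Romantic",
                     "Dance", "Scenes", "Scenes", "Romantic"],
    "actor": ["Aamir Khan", "Aamir Khan", "Aamir Khan", "Aamir Khan",
              "Aamir Khan", "Aamir Khan", "Aamir Khan,Salman", "Salman"],
})

# Turn each comma-separated string into a list, then explode the lists so
# that every (row, actor) pair becomes its own row.
exploded = (
    df.assign(actor=df["actor"].str.split(","))
      .explode("actor")
      .reset_index(drop=True)
)

# Assumption: names may carry stray whitespace around the comma.
exploded["actor"] = exploded["actor"].str.strip()

# Cross-tabulate actor against genre; crosstab counts occurrences and
# puts 0 where a combination never appears.
counts = pd.crosstab(exploded["actor"], exploded["genrename"])

# Flatten to ordinary columns and drop the leftover 'genrename' axis label.
result = counts.reset_index().rename_axis(None, axis=1)
print(result)
```

This should give the same counts as the pivot_table version, with the actor column named actor rather than actor1.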

2 Comments

Sorry, give me a moment.
If my answer was helpful, don't forget to accept it. Thanks.
