Hi, I have data like this, for example:

id        genre    total_play
1         pop      20
1         rock     30
1         jazz     60
2         pop      60
2         country  30
2         rock     25
3         latin    25
3         kpop     25
3         folk     10

I want to create a new column based on the following rules:

  • If jazz makes up more than 30% of a user's total plays, the user is labeled category A
  • If pop makes up more than 40% of a user's total plays, the user is labeled category B
  • Otherwise, the user is labeled category C

The result should look like this:

id   tendency
1    A
2    B
3    C

Thanks in advance :)

1 Answer

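Let's pivot the table so that each genre becomes a column, sum across the columns to get each user's total plays, then use np.select to assign the label. np.select returns the choice for the first condition that holds, so a user who meets both rules gets A: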

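import numpy as np
import pandas as pd

# One row per id, one column per genre; 0 where a user never played a genre
plays  = df.pivot_table('total_play','id','genre',fill_value=0)
totals = plays.sum(axis=1)  # total plays per user

# The first condition that holds wins: 'A', then 'B', otherwise 'C'
pd.Series(np.select([plays['jazz']>totals*0.3, plays['pop'] > totals*0.4],['A','B'],'C'),
          index=plays.index)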

Output:

id
1    A
2    B
3    C
dtype: object
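
If it helps, here is a self-contained sketch of the same idea that you can run as-is. It rebuilds the sample data from the question and returns an id/tendency table. Two things in it are my own assumptions rather than part of the question: aggfunc="sum" (so repeated id/genre rows are added up instead of averaged, which is what pivot_table does by default) and the .get(...) fallback that treats a genre missing from the data as 0 plays. The names result, jazz and pop are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "genre": ["pop", "rock", "jazz", "pop", "country", "rock",
              "latin", "kpop", "folk"],
    "total_play": [20, 30, 60, 60, 30, 25, 25, 25, 10],
})

# One row per id, one column per genre.
# Assumption: duplicate (id, genre) rows should be summed, not averaged.
plays = df.pivot_table(values="total_play", index="id", columns="genre",
                       aggfunc="sum", fill_value=0)
totals = plays.sum(axis=1)  # total plays per user

# Assumption: a genre that never appears in the data counts as 0 plays,
# so fall back to a zero Series instead of raising a KeyError.
zeros = pd.Series(0, index=plays.index)
jazz = plays.get("jazz", zeros)
pop = plays.get("pop", zeros)

# np.select picks the choice for the first condition that holds,
# so the jazz rule (A) takes priority over the pop rule (B).
tendency = np.select([jazz > totals * 0.3, pop > totals * 0.4],
                     ["A", "B"], default="C")

# Turn the labels into a two-column table: id, tendency.
result = pd.Series(tendency, index=plays.index, name="tendency").reset_index()
print(result)
```

For the sample data the shares work out to 60/110 jazz for id 1 and 60/115 pop for id 2, while id 3 has neither genre, which is why the labels come out as A, B and C.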