Transition count within a column from one value to another value in Pandas

Question

I have the following DataFrame.

import pandas as pd

df = pd.DataFrame({'Player': [1,1,1,1,2,2,2,3,3,3,4,5], 'Team': ['X','X','X','Y','X','X','Y','X','X','Y','X','Y'], 'Month': [1,1,1,2,1,1,2,2,2,3,4,5]})

Input:

    Player Team  Month
0        1    X      1
1        1    X      1
2        1    X      1
3        1    Y      2
4        2    X      1
5        2    X      1
6        2    Y      2
7        3    X      2
8        3    X      2
9        3    Y      3
10       4    X      4
11       5    Y      5

The DataFrame lists players, the team each belongs to, and the month. There can be multiple entries for the same player in a given month. Some players move from Team X to Team Y in a particular month, some don’t move at all, and some join Team Y directly.

I am looking for the total count of players who moved from Team X to Team Y in a given month. The output should look like the table below, i.e. the month of transition and the total count of transitions. In this case, Players 1 and 2 moved in Month 2 and Player 3 moved in Month 3. Players 4 and 5 didn't move.

Expected Output:

   Month  Count
0      2      2
1      3      1

I can get this done as follows.

# Find all the players who moved from Team X to Team Y
s1 = df.drop_duplicates(['Team','Player'])
s2 = s1.groupby('Player').size().reset_index(name='counts')
s2 = s2[s2['counts']>1]
# Take each player's last de-duplicated row to find the month in which they moved
s3 = s1.groupby("Player").last().reset_index()
s4 = s3[s3['Player'].isin(s2['Player'])]
s5 = s4.groupby('Month').size().reset_index(name='Count')

I am pretty sure there is a better way than what I did here. I'm just looking for some help to make it more efficient.

Is it possible for (1,X,1) and (1,Y,1) to coexist in month=1 (i.e. the player changes team within a month)?
– Bill Huang, Commented Oct 30, 2020 at 13:47
Currently you are dropping duplicates on Team and Player and excluding Month, so is it safe to assume that player 1 would not go from X to Y and back to X in a three-month span?
– It_is_Chris, Commented Oct 30, 2020 at 13:52
Yes, you can assume that there is no possibility of going back from Y to X.
– sharathnatraj, Commented Oct 30, 2020 at 13:55

Bill Huang · Accepted Answer · 2020-10-30 16:33:37Z

2

First pick out the rows where (1) the team differs from the previous row and (2) the row is not the first row of a player. Then compute the number of such rows in each month.

mask = df["Team"].shift().ne(df["Team"]) & df["Player"].shift().eq(df["Player"])
out = df[mask].groupby("Month").size()

Output:

print(out)  # a Series

Month
2    2
3    1
dtype: int64

# series to dataframe (optional)
out.to_frame(name="count").reset_index()

   Month  count
0      2      2
1      3      1

Edit: the first groupby in the mask was redundant, so I removed it.
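
To see why the mask works, it can help to lay the shifted values next to the originals. The following is only a sketch on the question's sample data: the helper column names `prev_team` and `prev_player` and the variables `dbg`, `strict_mask` and `strict_out` are made up for illustration. It also restricts the count to X-to-Y moves explicitly, which goes beyond the answer's mask (that one counts any team change within a player). Like the answer, it assumes each player's rows are contiguous and in chronological order.

```python
import pandas as pd

df = pd.DataFrame({
    "Player": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 5],
    "Team": ["X", "X", "X", "Y", "X", "X", "Y", "X", "X", "Y", "X", "Y"],
    "Month": [1, 1, 1, 2, 1, 1, 2, 2, 2, 3, 4, 5],
})

# Hypothetical helper columns: the previous row's team and player.
dbg = df.assign(prev_team=df["Team"].shift(), prev_player=df["Player"].shift())

# Keep rows where the previous row belongs to the same player
# and the team went specifically from X to Y.
strict_mask = (
    dbg["prev_player"].eq(dbg["Player"])
    & dbg["prev_team"].eq("X")
    & dbg["Team"].eq("Y")
)

# Count the transition rows per month.
strict_out = dbg[strict_mask].groupby("Month").size()
print(dbg)
print(strict_out)
```

Printing `dbg` shows that the first row of each player has a `prev_player` that differs from `Player` (or is missing), which is what excludes it from the count.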

edited Oct 30, 2020 at 16:33

answered Oct 30, 2020 at 14:03

Bill Huang


2 Comments

Booboo Over a year ago

Why ~df["Player"].shift().ne(df["Player"]) instead of df["Player"].shift().eq(df["Player"])?

Bill Huang Over a year ago

Yeah thanks, that can be simplified. Incorporated into the post. I was thinking that way because shift-ne is like a phrase to locate the difference in my mindset.

Quang Hoang (edited by Booboo) · Accepted Answer · 2020-10-30 16:37:00Z

1

One option is to self-merge on Player and Month, then check which players moved:

s = df.drop_duplicates()

t = (s.merge(s.assign(Month=s.Month+1), on=['Player', 'Month'], how='right')
  .assign(Count=lambda x: x.Team_x.eq('Y') & x.Team_y.eq('X'))
  .groupby('Month', as_index=False)['Count'].sum()
)
print(t.loc[t['Count'] != 0])

Output:

   Month  Count
0      2      2
1      3      1
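
The self-merge pairs each (Player, Month) row with the same player's row from the month before, so it appears to rely on a move showing up in consecutive months. Looking at the merged frame before aggregation makes the pairing visible. This is a sketch on the question's sample data; the names `prev`, `merged` and `moved` are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Player": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 5],
    "Team": ["X", "X", "X", "Y", "X", "X", "Y", "X", "X", "Y", "X", "Y"],
    "Month": [1, 1, 1, 2, 1, 1, 2, 2, 2, 3, 4, 5],
})

s = df.drop_duplicates()

# Shift every row one month forward so it lines up with the following month.
prev = s.assign(Month=s["Month"] + 1)

# Team_x is the team in the given month, Team_y the team one month earlier.
merged = s.merge(prev, on=["Player", "Month"], how="right")

# A move is a row whose current team is Y and whose previous team was X.
moved = merged["Team_x"].eq("Y") & merged["Team_y"].eq("X")
print(merged.assign(moved=moved))
```

Rows with a missing `Team_x` come from the right-hand side only (there is no entry for that player in the following month) and never count as a move.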

edited Oct 30, 2020 at 16:37

Booboo


answered Oct 30, 2020 at 13:45

Quang Hoang

