I have the below dataframe.
import pandas as pd

df = pd.DataFrame({
    'Player': [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 5],
    'Team': ['X', 'X', 'X', 'Y', 'X', 'X', 'Y', 'X', 'X', 'Y', 'X', 'Y'],
    'Month': [1, 1, 1, 2, 1, 1, 2, 2, 2, 3, 4, 5],
})
Input:
    Player Team  Month
0        1    X      1
1        1    X      1
2        1    X      1
3        1    Y      2
4        2    X      1
5        2    X      1
6        2    Y      2
7        3    X      2
8        3    X      2
9        3    Y      3
10       4    X      4
11       5    Y      5
The data frame consists of players, the team they belong to, and the month. There can be multiple entries for the same player in a given month. Some players move from Team X to Team Y in a particular month, some don’t move at all, and some join Team Y directly.
I am looking for the total count of players who moved from Team X to Team Y in a given month. The output should look like the below, i.e., the month of transition and the total count of transitions. In this case, Players 1 and 2 moved in Month 2 and Player 3 moved in Month 3. Players 4 and 5 didn't move.
Expected Output:
   Month  Count
0      2      2
1      3      1
I am able to get this done in the below fashion.
# Find all the players who moved from Team X to Team Y
s1 = df.drop_duplicates(['Team','Player'])
s2 = s1.groupby('Player').size().reset_index(name='counts')
s2 = s2[s2['counts']>1]
# Tie them back to the original df to find the month in which they moved
s3 = s1.groupby("Player").last().reset_index()
s4 = s3[s3['Player'].isin(s2['Player'])]
s5 = s4.groupby('Month').size().reset_index(name='Count')
I am pretty sure there is a better way than what I did here. I am just looking for some help to make it more efficient.
You are dropping duplicates on Team and Player and excluding Month, so is it safe to assume that player 1 would not go from X to Y and back to X in a three-month span?
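For what it's worth, here is a minimal sketch of one possible shorter approach that does not depend on that assumption. It compares each row's team with the team on the same player's previous row, using groupby with shift. It assumes the rows are already in chronological order within each player (as in the sample data), and the names prev_team, moved, and result are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Player': [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 5],
    'Team': ['X', 'X', 'X', 'Y', 'X', 'X', 'Y', 'X', 'X', 'Y', 'X', 'Y'],
    'Month': [1, 1, 1, 2, 1, 1, 2, 2, 2, 3, 4, 5],
})

# Assumption: rows are already in chronological order within each player.
# Team on the same player's previous row (missing for a player's first row).
prev_team = df.groupby('Player')['Team'].shift()

# A transition is a Team Y row whose previous row for that player was Team X.
# Players who start on Y have no previous row, so they are not counted.
moved = df[(prev_team == 'X') & (df['Team'] == 'Y')]

# Count transitions per month.
result = moved.groupby('Month').size().reset_index(name='Count')
print(result)
```

Note that this counts every X-to-Y transition, so a player who went X, Y, X, Y would be counted twice. If only the first move per player should count, something like moved.drop_duplicates('Player') before the final groupby would be one way to handle it.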