
I have a dataframe with the columns 'gender' and 'Year'. I need to get the ratio of females to males for each year. Below is a sample dataframe.

import pandas as pd

data = {'gender': ['m', 'm', 'm', 'f', 'm', 'f'],
        'Year': ['2000', '2000', '2003', '2000', '2001', '2001']}
my_df = pd.DataFrame(data, columns=['gender', 'Year'])
my_df = my_df.sort_values('Year')   # trial: sort by year
print(my_df)

My output should be:

data = {'Year': ['2000', '2001', '2003'],
        'ratio': [0.33, 0.5, 0]}
my_df = pd.DataFrame(data, columns=['Year', 'ratio'])
print(my_df)

This is what I tried: I first sort the dataframe based on year so that it is easier to get the total count. But I am not sure how to get the number of males in that specific year.

  • @jezrael, the total number of participants in the year 2000 is 3 and the number of female participants is 1. So 1/3, i.e. 0.33. Commented Nov 23, 2020 at 9:16
  • Thank you, so you need my second solution. Commented Nov 23, 2020 at 9:17

1 Answer

Use crosstab first, then divide the columns to get the ratio of the counts:

df = pd.crosstab(my_df['Year'], my_df['gender'])
df['ratio'] = df['f'].div(df['m'])
print(df )
gender  f  m  ratio
Year               
2000    1  2    0.5
2001    1  1    1.0
2003    0  1    0.0
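
One thing to watch with the f/m division: a year without any males has a zero denominator, so the ratio comes out as inf. A minimal sketch of one way to handle that, where the extra year '2004' and the choice to report a missing value (NaN) instead of inf are assumptions for illustration, not part of the question's data:

```python
import pandas as pd

# Sample data from the question plus a hypothetical year '2004' with only females
data = {'gender': ['m', 'm', 'm', 'f', 'm', 'f', 'f'],
        'Year': ['2000', '2000', '2003', '2000', '2001', '2001', '2004']}
my_df = pd.DataFrame(data, columns=['gender', 'Year'])

counts = pd.crosstab(my_df['Year'], my_df['gender'])
# Keep the male count only where it is positive; zero counts become NaN,
# so the ratio for such a year is NaN rather than inf
counts['ratio'] = counts['f'].div(counts['m'].where(counts['m'] > 0))
print(counts)
```

Whether NaN, inf, or some other placeholder is the right answer for such a year depends on how the ratio is used afterwards.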

If you need each gender's share of all values per year, add normalize=0 (the same as normalize='index'), which normalizes over each row:

df = pd.crosstab(my_df['Year'], my_df['gender'], normalize=0).add_prefix('ratio_')
print(df )
gender   ratio_f   ratio_m
Year                      
2000    0.333333  0.666667
2001    0.500000  0.500000
2003    0.000000  1.000000
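
For comparison, the same per-year shares can probably also be obtained without crosstab. This is only a sketch of an alternative, assuming the question's sample data; the name shares is made up for illustration:

```python
import pandas as pd

# Sample data from the question
data = {'gender': ['m', 'm', 'm', 'f', 'm', 'f'],
        'Year': ['2000', '2000', '2003', '2000', '2001', '2001']}
my_df = pd.DataFrame(data, columns=['gender', 'Year'])

# Relative frequency of each gender within each year;
# unstack turns the gender level into columns, and fill_value=0
# covers years in which a gender does not occur at all
shares = (my_df.groupby('Year')['gender']
               .value_counts(normalize=True)
               .unstack(fill_value=0))
print(shares)
```

The crosstab version is shorter; the groupby form may be handier when it is part of a longer groupby chain.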

If you need only the ratio of females:

df = my_df['gender'].eq('f').groupby(my_df['Year']).mean().reset_index(name='ratio')
print(df)
   Year     ratio
0  2000  0.333333
1  2001  0.500000
2  2003  0.000000    
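
To match the two-decimal values in the expected output of the question, the result can be rounded. A self-contained sketch, assuming the question's sample data; the name ratio_df and the rounding to two places are choices made here for illustration:

```python
import pandas as pd

# Sample data from the question
data = {'gender': ['m', 'm', 'm', 'f', 'm', 'f'],
        'Year': ['2000', '2000', '2003', '2000', '2001', '2001']}
my_df = pd.DataFrame(data, columns=['gender', 'Year'])

# eq('f') gives a boolean Series; its mean per year is the share of females
ratio_df = (my_df['gender'].eq('f')
            .groupby(my_df['Year'])
            .mean()
            .round(2)
            .reset_index(name='ratio'))
print(ratio_df)
```

Rounding is best left to the very end, since it discards precision that later calculations might need.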