How to group by multiple columns in dataframe using R and do aggregate function

Question

I have a dataframe with columns as defined below. I have provided one set of example, similar to this I have many countries with loan amount and gender variables

      country          loan_amount          gender 
1      Austia              175                F        
2      Austia              100                F        
3      Austia              825                M        
4      Austia              175                F        
5      Austia             1025                M        
6      Austia              225                F

Here I need to group by countries and then for each country, I need to calculate loan percentage by gender in new columns, so that new columns will have male percentage of total loan amount for that country and female percentage of total loan amount for that country. I need to do two group_by function, first to group all countries together and after that group genders to calculate loan percent.

    Total loan amount = 2525
     female_prcent = 175+100+175+225/2525 = 26.73
     male_percent = 825+1025/2525 = 73.26

The output should be as below:

      country            female_percent   male_percent 
1      Austia              26.73            73.26

I am trying to do this in R. I tried the below function, but my R session is not producing any result and it is terminating.

  df %>%
  group_by(country, gender) %>%
  summarise_each(funs(sum))

Could someone help me in achieving this output? I think this can be achieved using dplyr function, but I am struck inbetween.

As there is only one column, you can use summarise instead of summarise_each` — akrun
– akrun, Commented Jun 5, 2016 at 14:51
Yes, it should work, but I am not sure what you meant by terminating? — akrun
– akrun, Commented Jun 5, 2016 at 14:54
R studio is getting aborted. I tried summarise option, but it is not helping out. — SRS
– SRS, Commented Jun 5, 2016 at 14:57
As @akrun said, you can just use summarise. E.g.: d1 %>% group_by(country, gender) %>% summarise(amt = sum(loan_amount)) %>% transmute(gender = gender, perc = amt/sum(amt)) will give the desired result in long format. — Jaap
– Jaap, Commented Jun 5, 2016 at 15:02

akuiper · Accepted Answer · 2016-06-05 15:12:10Z

1

We can try the weighted table from questionr package:

library(questionr)
with(df, wtd.table(country, gender, weights = round(100 * loan_amount/sum(loan_amount), 2)))

           F     M
Austia 26.73 73.26

answered Jun 5, 2016 at 15:12

akuiper

216k33 gold badges363 silver badges380 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

How to group by multiple columns in dataframe using R and do aggregate function

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related