I have a sample data set containing climate data for different seasons:

df <- data.frame(season=rep(1:5,2),year=rep(1:2,each=5),
      temp=c(2,4,3,5,2,4,1,5,4,3),ppt=c(4,3,1,5,6,2,1,2,2,2),
      samples=c(22,25,24,31,31,29,28,31,30,32))

I can easily compute the mean of each climate variable for every season and year:

aggregate(df[,c('temp','ppt')], by = list(df$season,df$year), function(x) mean(x,na.rm=T))

However, I want to determine the weighted mean of each season|year combo using variable samples as my weights.

Essentially I want to replace my mean function in aggregate() with weighted.mean. That would require adding a second argument to my function that needs to change with my x.

    function(x, w) weighted.mean(x, w, na.rm = TRUE)

However, I'm not sure how to make the weight argument (w) of weighted.mean() vary with each subset of the aggregated data.

Can I do this all within an aggregate function?

Any advice would be great!

1 Answer

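Try summarise_each from dplyr. It groups the data first with group_by and then applies the function to several columns at once, so the weights are taken from the same group as the values. (In current dplyr releases, summarise_each and funs are deprecated in favour of summarise combined with across.)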

library(dplyr)
df %>% group_by(season, year) %>%
        summarise_each(funs(weighted.mean(., samples,na.rm=T)), temp,ppt)
# Source: local data frame [10 x 5]
# Groups: season, year [10]
# 
#    season  year  temp   ppt samples
#    (int) (int) (dbl) (dbl)   (dbl)
# 1       1     1     2     4      22
# 2       2     1     4     3      25
# 3       3     1     3     1      24
# 4       4     1     5     5      31
# 5       5     1     2     6      31
# 6       1     2     4     2      29
# 7       2     2     1     1      28
# 8       3     2     5     2      31
# 9       4     2     4     2      30
# 10      5     2     3     2      32
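
Every season/year pair in the sample data has exactly one row, so each weighted mean is just the original value, which is why the output above reproduces the input. To see the weights take effect, here is a minimal base R sketch that pools all seasons of year 1 instead; the names yr1, wm, manual and plain are only illustrative.

```r
# Same sample data as in the question.
df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# All five seasons of year 1, treated as a single group.
yr1 <- df[df$year == 1, ]

# Weighted mean of temp, using samples as the weights.
wm <- weighted.mean(yr1$temp, yr1$samples, na.rm = TRUE)

# The same quantity written out: sum(x * w) / sum(w) = 433 / 133.
manual <- sum(yr1$temp * yr1$samples) / sum(yr1$samples)

# Unweighted mean for comparison: 16 / 5 = 3.2.
plain <- mean(yr1$temp)

c(weighted = wm, manual = manual, unweighted = plain)
```

The weighted and unweighted results differ here because the seasons with more samples pull the mean toward their own temp values.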

2 Comments

Can this be done using aggregate or any other function in the base package of R?
I have no idea why you want a complicated base solution when this works, but here you go. The explanation would take very long to go through:

df[,c("temp", "ppt")] <- matrix(ncol=2, unlist(do.call(rbind,
    lapply(split(df, list(df$season, df$year)), function(df) {
        lapply(df[,c("temp", "ppt")], function(cols) weighted.mean(cols, df$samples, na.rm=T))
    }))))
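
For readers who want the base R route spelled out, here is a hedged sketch of the same split-and-apply idea that returns one row per season/year group instead of writing back into df. Writing back into df lines up in the snippet above only because each group in the sample data has a single row. The names vars, groups, rows and result are illustrative and not from the original thread.

```r
# Same sample data as in the question.
df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# Columns to average.
vars <- c("temp", "ppt")

# One data frame per season/year combination; drop = TRUE skips
# combinations that never occur in the data.
groups <- split(df, list(df$season, df$year), drop = TRUE)

# For each group, keep its keys and compute the weighted mean of every
# variable, using that group's own samples column as the weights.
rows <- lapply(groups, function(g) {
  means <- lapply(g[vars], weighted.mean, w = g$samples, na.rm = TRUE)
  data.frame(season = g$season[1], year = g$year[1], means)
})

# Stack the per-group rows into a single data frame.
result <- do.call(rbind, rows)
rownames(result) <- NULL
result
```

Because the weights are looked up inside each group's own data frame, they change with every subset, which is what a single-argument function passed to aggregate() cannot do on its own.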
