Consider this toy data frame:
df <- data.frame(id = c(1, 2),
meandoy = c(3,2),
temp199701 = c(4,2),
temp199702 = c(15,10),
temp199703 = c(-3,7),
temp199704 = c(-1,6),
temp199801 = c(1,5),
temp199802 = c(9,10),
temp199803 = c(-2,2),
temp199804 = c(-5,11))
I want to add a new column with the result of a function for each year and each row. In other words, each new GDDyearcolumn gets the value of calculating from tempyear01to tempyear04.
I can achieve it with this:
sum.GDD <- function(x) sum(x[x > 5]-5, na.rm = TRUE)
yearlist <- c(1997, 1998)
for (year in yearlist){
text <- paste("GDD",toString(year), sep = "")
df[[text]] <- df %>% #store result in this vector
dplyr::select(contains(toString(year))) %>% #take variables that have year
apply(1, sum.GDD) #calculate GDD5 across those variables
}
But there is a twist. I want to apply the function only to the number of columns specified in meandoy each year.
For example, GDD1997 in the first row will be the result of calculating the first 3 columns starting from temp199701, because meandoy = 3. GDD1998 will get the result from temp199801, temp199802 and temp199803.
In the second row the meandoy = 2 so the result of GDD1997 will be calculated from temp199701 and temp199702. GDD1998 from temp199801 and temp199802.