I am trying to apply a function across a number of dataframes using lapply. The function works when I invoke it on each of the dataframes individually, but lapply throws an error. The error doesn't seem relevant. I can't work out what the issue is. Here is an example:
library(dplyr)

a <- data.frame('country' = factor(c(rep(1, 5), rep(2, 5))),
                'variable' = factor(c(rep('A', 5), rep('B', 5))),
                'value' = runif(10, 0, 1),
                'year' = runif(10, 0, 1))

# Fit value ~ year within each country/variable group and return the slope
slope <- function(dat) {
  dat %>%
    filter(!value %in% c(-66, -77, -88) & !is.na(value)) %>%
    group_by(country, variable) %>%
    do(data.frame(slope = coef(lm(value ~ year, .))[2])) %>%
    ungroup()
}
This function works:
> slope(a)
# A tibble: 2 x 3
country variable slope
<fct> <fct> <dbl>
1 1 A 0.140
2 2 B -0.150
But lapply doesn't:
> lapply(a, slope)
Error in UseMethod("filter_") :
no applicable method for 'filter_' applied to an object of class "factor"
I don't understand the error because value, which is filtered, is numeric (not a factor).
> str(a)
'data.frame': 10 obs. of 4 variables:
$ country : Factor w/ 2 levels "1","2": 1 1 1 1 1 2 2 2 2 2
$ variable: Factor w/ 2 levels "A","B": 1 1 1 1 1 2 2 2 2 2
$ value : num 0.884 0.513 0.835 0.83 0.694 ...
$ year : num 0.4288 0.2874 0.0531 0.7793 0.0496 ...
Obviously when using lapply in practice, I would be using it on a number of dataframes. I don't think it makes a difference in the example - I get the same error when trying to do this on a number of dataframes. I assume I am missing something obvious.
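When lapply is applied to a, it is looping through the columns, i.e. each element passed to slope is a vector and is not a data.frame. To apply the function to separate data frames, pass a list of data frames instead, for example:
split(a, a$country) %>% lapply(slope)

I see. I was calling lapply on c('a', 'b', 'c'), rather than on list('a', 'b', 'c') - where 'a', 'b' and 'c' are all dataframes. When I do that it all works. Thanks for this.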
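
For anyone landing here later, a minimal sketch of the difference, assuming dplyr is installed. The make_df() helper, the seed, and the names dfs and results are hypothetical and only there to make the example self-contained. The slope function is the one from the question. The point is that a data frame is itself a list of columns, so lapply over a single data frame (or over c() of several data frames, which flattens them into one long list of columns) hands slope a vector, while lapply over list() of data frames hands it one whole data frame at a time.

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only so the sketch is reproducible

# Hypothetical helper: builds a data frame shaped like `a` in the question
make_df <- function() {
  data.frame(country  = factor(rep(1:2, each = 5)),
             variable = factor(rep(c("A", "B"), each = 5)),
             value    = runif(10),
             year     = runif(10))
}

# The function from the question, unchanged
slope <- function(dat) {
  dat %>%
    filter(!value %in% c(-66, -77, -88) & !is.na(value)) %>%
    group_by(country, variable) %>%
    do(data.frame(slope = coef(lm(value ~ year, .))[2])) %>%
    ungroup()
}

# A named list of data frames: each list element is a whole data frame
dfs <- list(a = make_df(), b = make_df(), c = make_df())

# lapply over ONE data frame visits its columns, not the data frame itself
column_classes <- sapply(dfs$a, class)

# c() on data frames flattens them into a single list of columns,
# so lapply over that would again pass vectors to slope()
flattened <- c(dfs$a, dfs$b)

# lapply over the list passes one data frame per call, which is what slope() expects
results <- lapply(dfs, slope)
```

Here results is a list with one tibble of slopes per input data frame, and length(flattened) shows that c() produced a list of columns rather than a list of two data frames.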