Here's a data.table:
library(data.table)
dt <- data.table(group = c("a","a","a","b","b","b"), x = c(1,3,5,1,3,5), y = c(3,5,8,2,8,9))
dt
   group x y
1:     a 1 3
2:     a 3 5
3:     a 5 8
4:     b 1 2
5:     b 3 8
6:     b 5 9
And here's a function that takes a data.table and returns a data.table:
myfunc <- function(dt){
  # Hyman spline interpolation (which preserves monotonicity)
  newdt <- data.table(x = seq(min(dt$x), max(dt$x)))
  newdt$y <- spline(x = dt$x, y = dt$y, xout = newdt$x, method = "hyman")$y
  return(newdt)
}
How do I apply myfunc to each subset of dt defined by the "group" column? In other words, I want an efficient, generalized way to do this:
result <- rbind(myfunc(dt[group=="a"]), myfunc(dt[group=="b"]))
result
    x     y
 1: 1 3.000
 2: 2 3.875
 3: 3 5.000
 4: 4 6.375
 5: 5 8.000
 6: 1 2.000
 7: 2 5.688
 8: 3 8.000
 9: 4 8.875
10: 5 9.000
EDIT: I've updated my sample dataset and myfunc because I think it was initially too simplistic and invited work-arounds to the actual problem I'm trying to solve.
dt[, .(x = seq(min(x), max(x) + 1), y = rep(y, each = 2)), by = group]
myfunc <- function(x, y){ list(x = seq(min(x), max(x)+1), y = rep(y, each=2)) }
and then do
dt[, myfunc(x, y), by = group]
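For the updated myfunc, which takes a whole data.table rather than individual columns, one possible approach is to pass .SD (data.table's per-group subset of the data) to the function inside j. This is a minimal sketch, assuming the dt and myfunc from the question; result_by_group is just an illustrative name, not something from the original post.

```r
library(data.table)

# Sample data from the question
dt <- data.table(group = c("a","a","a","b","b","b"),
                 x = c(1,3,5,1,3,5),
                 y = c(3,5,8,2,8,9))

# myfunc from the question: Hyman spline interpolation on an integer grid
myfunc <- function(dt) {
  newdt <- data.table(x = seq(min(dt$x), max(dt$x)))
  newdt$y <- spline(x = dt$x, y = dt$y, xout = newdt$x, method = "hyman")$y
  newdt
}

# .SD holds the rows of the current group (all columns except `group`),
# so myfunc is called once per group and the pieces are stacked together.
# result_by_group is an illustrative name.
result_by_group <- dt[, myfunc(.SD), by = group]
```

Unlike the rbind version, the output also carries the group column, which is usually handy for telling the pieces apart. If dt has many columns that myfunc does not need, restricting them with .SDcols = c("x", "y") may reduce the per-group overhead.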