I would like to predict values from a linear regression fitted separately to each of several groups in a single data frame. I found the following blog post, which does almost everything I need: https://www.r-bloggers.com/2016/09/running-a-model-on-separate-groups/
However, I cannot work out how to combine that approach with predict() and its newdata argument. For a single group, I use the following:
m <- lm(y ~ x, df)
new_df <- data.frame(x=c(5))
predict(m, new_df)
This gives me the predicted value of y at x = 5.
How do I do this when I have multiple groups in my df? This is what I tried:
df %>%
  nest(-group) %>%
  mutate(fit = map(data, ~ lm(.$y ~ .$x)),
         results = map(fit, predict)) %>%
  unnest(results)
When I try to use results = map(fit, predict(new_df)) instead, I only get an error. Is there a way to pass my value for x (in this case 5) into the code above?
Ideally, I would get a new data.frame with two columns, group and the predicted y-value.
This is a sample data.frame:
group x y
g1 1 2
g1 1.5 3
g1 2 4
g1 2.3 4.4
g1 3 6
g1 3.4 6.2
g1 4.11 7
g1 4.8 7.9
g1 5 8
g1 5.3 8.2
g2 2 5
g2 2.3 4
g2 4 2.2
g2 4.4 1.9
g2 7 0.3
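For anyone who wants to reproduce this, the table above could be built as follows. This is a minimal sketch: the names df and new_df simply match the snippets in this post, and storing group as a plain character column is an assumption.

```r
# Sample data from the table above; `df` is the name used throughout this post
df <- data.frame(
  group = rep(c("g1", "g2"), times = c(10, 5)),  # 10 rows for g1, 5 rows for g2
  x = c(1, 1.5, 2, 2.3, 3, 3.4, 4.11, 4.8, 5, 5.3,
        2, 2.3, 4, 4.4, 7),
  y = c(2, 3, 4, 4.4, 6, 6.2, 7, 7.9, 8, 8.2,
        5, 4, 2.2, 1.9, 0.3)
)

# Single-row newdata holding the x value to predict at
new_df <- data.frame(x = 5)
```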
EDIT:
Plotting the sample data using ggplot2, I get the following plot:
ggplot(df, aes(x, y, colour = group)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE)
Fitting each group separately with the following code gives me the predicted y-values I am after:
predict(lm(y ~ x, df[df$group =="g1", ]), new_df)
1
8.180285
predict(lm(y ~ x, df[df$group =="g2", ]), new_df)
1
1.732136
I would like to generate a new data frame that looks something like this, with one row per group containing the predicted y-value at x = 5:
group y_predict
g1 8.180285
g2 1.732136
