1

I would like to add the regression line equation and r squared value to my ggplot2 scatter plot.

I found a similar question, which gives the code below, but it doesn't work when I force the regression through the origin:

library(devtools)
library(ggplot2)
# Defines stat_smooth_func(); the gist is third-party code, so review it before sourcing
source_gist("524eade46135f6348140")
df <- data.frame(x = 1:100)
df$y <- 2 + 5 * df$x + rnorm(100, sd = 40)
ggplot(data = df, aes(x = x, y = y, label = y)) +
  stat_smooth_func(geom = "text", method = "lm", hjust = 0, parse = TRUE, formula = y ~ x - 1) +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x - 1) +
  geom_point()

With formula = y ~ x - 1, the displayed text shows the slope coefficient in the intercept's position and NA in place of the other term. Is there a fix for this?

  • I'm not sourcing some unknown gist. If you found the code in a SO question, link that question. Even better, simply provide the source code of stat_smooth_func in your question. Commented Mar 11, 2016 at 13:46
  • stackoverflow.com/questions/7549694/… Commented Mar 11, 2016 at 13:53
  • The above link is where I found the code referenced in the question Commented Mar 11, 2016 at 13:57

2 Answers

8

One option is geom_smooth(method = "lm", formula = y ~ 0 + x).
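
As a small check, in base R only: y ~ 0 + x and y ~ x - 1 specify the same no-intercept model, so either spelling gives a fit with a single coefficient. The sketch below uses the question's simulated data with an assumed seed. Any label code that reads a second coefficient would get NA; whether the gist does exactly that is an assumption here, not something verified against its source.

```r
# Simulated data as in the question; the seed is an assumption for reproducibility
set.seed(42)
df <- data.frame(x = 1:100)
df$y <- 2 + 5 * df$x + rnorm(100, sd = 40)

# Two spellings of "regression through the origin"
fit_zero  <- lm(y ~ 0 + x, data = df)
fit_minus <- lm(y ~ x - 1, data = df)

# Both fits have the same single coefficient, named "x"
same_fit   <- isTRUE(all.equal(coef(fit_zero), coef(fit_minus)))
coef_names <- names(coef(fit_zero))

# Indexing a second coefficient that does not exist yields NA
second_coef <- coef(fit_zero)[2]

print(same_fit)
print(coef_names)
print(second_coef)
```

So switching between the two formula spellings changes the fitted line in neither case; the label still has to be built from a single coefficient.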

4

In this simple case (without faceting or grouping), you don't need to create a new stat_*. You can simply do this:

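library(ggplot2)

# Fit the regression through the origin (no intercept term)
fit <- lm(y ~ x - 1, data = df)
ggplot(data = df, aes(x = x, y = y)) +
  # Draw the fitted line from the model's predictions
  stat_function(fun = function(x) predict(fit, newdata = data.frame(x = x)),
                color = "blue", linewidth = 1.5) +  # use size = 1.5 on ggplot2 < 3.4.0
  # Label with the slope and R²; x and y place the text in data units
  annotate(label = sprintf("y = %.3f x\nR² = %.2f", coef(fit), summary(fit)$r.squared),
           geom = "text", x = 25, y = 400, size = 12) +
  geom_point()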

resulting plot

Of course, the stat_* function from the gist would be easy to adjust for regression through the origin.
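
One way such an adjustment could look, as a sketch rather than a patch to the gist: a hypothetical helper, lm_label(), builds a plotmath string from a fitted simple regression and leaves out the intercept term when the model has none. The helper name, the number formatting, and the label position are all assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical helper (not part of the gist): build a plotmath label for a
# simple linear model with one predictor, with or without an intercept.
lm_label <- function(fit) {
  cf <- coef(fit)
  has_intercept <- "(Intercept)" %in% names(cf)
  slope <- cf[[length(cf)]]  # with one predictor, the slope is the last coefficient
  rhs <- if (has_intercept) {
    sprintf("%.3g + %.3g %%.%% italic(x)", cf[["(Intercept)"]], slope)
  } else {
    sprintf("%.3g %%.%% italic(x)", slope)
  }
  # "=" is passed as a string because plotmath cannot chain two == operators
  sprintf("italic(y) == %s * \",\" ~ ~italic(r)^2 ~ \"=\" ~ %.3g",
          rhs, summary(fit)$r.squared)
}

# Simulated data as in the question; the seed is an assumption
set.seed(1)
df <- data.frame(x = 1:100)
df$y <- 2 + 5 * df$x + rnorm(100, sd = 40)

fit_origin <- lm(y ~ x - 1, data = df)
lbl <- lm_label(fit_origin)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x - 1) +
  annotate("text", x = 0, y = max(df$y), hjust = 0, parse = TRUE, label = lbl)
print(p)
```

Because the helper checks the coefficient names instead of indexing positions 1 and 2, the same function also works for lm(y ~ x, data = df). With faceting or grouping, the label would have to be computed per group, which is the case where a custom stat_* is more convenient.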

Off-topic comment: It's very rare that regression without an intercept is sensible from a statistical point of view.

1 Comment

In the context of my data, a value of 0 for one variable means the other has to be 0 as well. In any case, the R² value is markedly improved when regressing through the origin for my data. Thank you for your answer - it's done the trick!
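
A caution on comparing R² values between the two models: for a model without an intercept, summary.lm() in R measures the residual sum of squares against sum(y^2) rather than against the variation around mean(y), so the reported value is usually larger and is not directly comparable with the R² of a model that has an intercept. The sketch below illustrates this with simulated data like the question's; the seed and variable names are assumptions.

```r
# Simulated data as in the question; the seed is an assumption
set.seed(1)
x <- 1:100
y <- 2 + 5 * x + rnorm(100, sd = 40)

fit_origin    <- lm(y ~ x - 1)  # regression through the origin
fit_intercept <- lm(y ~ x)      # ordinary regression with intercept

rss_origin <- sum(residuals(fit_origin)^2)

# R² as reported for the no-intercept model: total sum of squares around zero
r2_uncentered <- 1 - rss_origin / sum(y^2)

# The same residuals measured against variation around the mean of y
r2_centered <- 1 - rss_origin / sum((y - mean(y))^2)

r2_reported_origin    <- summary(fit_origin)$r.squared
r2_reported_intercept <- summary(fit_intercept)$r.squared

print(c(reported_origin    = r2_reported_origin,
        uncentered         = r2_uncentered,
        centered           = r2_centered,
        reported_intercept = r2_reported_intercept))
```

On the centered scale, the through-origin fit cannot exceed the intercept model's R², because least squares with an intercept minimizes the residual sum of squares over a larger set of lines. A higher reported R² for the through-origin fit therefore says little by itself about which model fits better.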
