0

What I want to do is to perform a regression loop that always has the same predictor but loops over responses (here: y1, y2 and y3). The problem is that I want it also to be done for each category of a grouping variable. In the example data below, I want to make the regression y_i=x for all three y variables, which would result in three regressions. But I want this to be done separately for group=a, group=b and group=c, resulting in 9 different regressions (preferably stored as lists). Cant figure out how to do it! Anyone who has an idea on how to do this?

My idea so far was to maybe do a for-loop or lapply combined with dplyr::group_by, but can't get it to work.

Example data (I have a much larger data set for the actual analysis).

set.seed(123)
dat <- data.frame(group=c(rep("a",10), rep("b",10), rep("c",10)), 
                   x=rnorm(30), y1=rnorm(30), y2=rnorm(30), y3=rnorm(30))

1 Answer 1

2

1) Use lmList in nlme (which comes with R so you don't have to install it).

library(nlme)
regs <- lmList(cbind(y1, y2, y3) ~ x | group, dat)

giving an lmList object having a component for each group. We show the component for group a and the other groups are similar.

> regs$a

Call:
lm(formula = object, data = dat, na.action = na.action)

Coefficients:
             y1       y2       y3     
(Intercept)   0.2943   0.1395   0.4539
x             0.3721  -0.2206  -0.2255

2) Another approach is to perform one overall lm giving an lm object having the same coefficients as above.

lm(cbind(y1, y2, y3) ~ group + x:group + 0, dat)

3) We could also use one of several list comprehension packages. This gives a list of 9 components. The names of the components identify the combination used as does the call component (shown in the Call: line of the output) within each main component. Note t hat the current CRAN version is 0.1.0 but the code below relies on listcompr 0.1.1 which can be obtained from github until it is put on CRAN.

# install.github("patrickroocks/listcompr")
library(listcompr)
packageVersion("listcompr") # need version 0.1.1 or later

regs <- gen.named.list("{y}.{g}",
  do.call("lm", 
    list(reformulate("x", y), quote(dat), subset = bquote(dat$group == .(g)))
  ), y = c("y1", "y2", "y3"), g = unique(dat$group)
)

If you don't mind that the Call: line in the output is less descriptive then it can be simplified to:

gen.named.list("{y}.{g}", lm(reformulate("x", y), dat, subset = group == g),
  y = c("y1", "y2", "y3"), g = unique(dat$group))

Note

The input corrected from question which had two y2's.

set.seed(123)
dat <- data.frame(group=c(rep("a",10), rep("b",10), rep("c",10)), 
                   x=rnorm(30), y1=rnorm(30), y2=rnorm(30), y3=rnorm(30))
Sign up to request clarification or add additional context in comments.

5 Comments

Thanks! went for the first solution, worked great. Updated the question so that there is not two y2:s
Now I have tried the third solution but I get the error message: "Error: could not evaluate variable range of 'y', got 'c("y1", "y2", "y3")', expected something like start_expr:end_expr, did not find numeric vector". What could be wrong?
Maybe it's the listcompr version. I am using the development version of listcompr from github. which is version 0.1.1. Try that and see if it makes a difference since it works for me.
I just checked the NEWS in the development version (0.1.1) and prior to that version variable ranges could only be numeric so that would account for what you experienced. Just install the version from github. I will update the answer to point this out.
Thanks! Updated to the github-version and it worked perfectly!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.