3

I have just started using R for statistical analysis and am still learning. I am having trouble creating loops in R and was wondering if anyone can help me with the following case. To me it seems impossible, but for some of you it is probably a piece of cake. I have a dataset of different firms across different years. For each firm I have several observations within the same year, and I need to run the following regression for each firm in each year (I have more than 1000 firms, so running the regression for each firm by hand is not practical):

Ri = α0 + β1Rm + β2Rz + Ɛ

The data I have looks like the following example:
Year   Firm    Ri    Rm    Rz
2009   A       30    55    85
2009   A       11    55    85
2009   A       1     55    85
2010   A       7     55    85
2010   A       15    55    85
2011   A       20    55    85
2011   A       3.5   55    85
2011   A       8     55    85
2009   B       24    55    85
2009   B       30    55    85
2009   B       25    55    85
2010   B       5.2   55    85
2010   B       11.8  55    85
2011   B       78    55    85
2011   B       90    55    85
2011   B       57    55    85

I need to obtain B1, B2 and the error term Ɛ for each firm in each year, like this:

Year Firm       B1    B2    Ɛ
2009   A       0.30  0.55  0.85
2010   A       0.11  0.55  0.85
2011   A       0.1   0.55  0.85
2009   B       0.7   0.55  0.85
2010   B       0.15  0.55  0.85
2011   B       0.20  0.55  0.85

Thank you in advance for your help

4
  • I know how to use the lm function, but I don't know how to run it for each firm in each year and get the results I need. Commented May 13, 2016 at 18:12
  • See ?lm and look at the subset = argument. Commented May 13, 2016 at 18:14
  • I would use dplyr, something like this. Commented May 13, 2016 at 18:19
  • Use the lmList function from package nlme. Commented May 13, 2016 at 20:22

3 Answers

2

You could do this using loops and subset, but you could also use mapply, like this. (I've made a larger dataset so I can demonstrate it properly.)

Year <- sort(rep.int(2009:2011, 30))
Firm <- gl(n = 2, k = 15, length = 90, labels = c('A', 'B'))
dta <- data.frame(Year, Firm, Ri = rnorm(90, 5, 2), Rm = rnorm(90, 2, 1), Rz = rnorm(90, -1, 0.5))

filt <- expand.grid(unique(dta$Year), unique(dta$Firm))

op <- mapply(function(x, y) lm(Ri ~ Rm + Rz, data = dta, subset = Year == x & Firm == y), 
             filt$Var1, filt$Var2, SIMPLIFY = FALSE)

sapply(op,coef)
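
To get from the coefficient matrix to the Year/Firm table requested in the question, something along these lines might work. This is only a sketch on simulated data: the names `res` and `sigma` are my own, `set.seed` is added for reproducibility, and it assumes the residual standard error is an acceptable stand-in for the "error term" column.

```r
# Simulated data with the same layout as above (values are random).
set.seed(1)
Year <- sort(rep.int(2009:2011, 30))
Firm <- gl(n = 2, k = 15, length = 90, labels = c("A", "B"))
dta <- data.frame(Year, Firm,
                  Ri = rnorm(90, 5, 2),
                  Rm = rnorm(90, 2, 1),
                  Rz = rnorm(90, -1, 0.5))

# Every Year/Firm combination, with named columns instead of Var1/Var2.
filt <- expand.grid(Year = unique(dta$Year), Firm = unique(dta$Firm))

# One model per combination; Firm is passed as a plain string.
op <- mapply(function(x, y) lm(Ri ~ Rm + Rz, data = dta,
                               subset = Year == x & Firm == y),
             filt$Year, as.character(filt$Firm), SIMPLIFY = FALSE)

# One row per Year/Firm pair: both slopes plus the residual standard error.
res <- data.frame(
  Year  = filt$Year,
  Firm  = filt$Firm,
  B1    = sapply(op, function(m) coef(m)[["Rm"]]),
  B2    = sapply(op, function(m) coef(m)[["Rz"]]),
  sigma = sapply(op, function(m) summary(m)$sigma)
)
res
```

Note that if Rm and Rz do not vary within a Year/Firm group, as in the sample rows of the question, lm cannot estimate the slopes and reports them as NA.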

1

Using subset = and two for loops.

for(i in unique(df$Year)) {
  for(j in unique(df$Firm)) {
     print(i)
     print(j)
     print(lm(Ri ~ Rm + Rz, data = df, subset = df$Year==i & df$Firm ==j))
  }
}

Per your new output:

m <- data.frame(matrix(ncol = 5, nrow = length(unique(df$Year))*length(unique(df$Firm))))
l = 0
for(i in unique(df$Year)) {
  for(j in unique(df$Firm)) {
    l = l + 1
    mod<-lm(Ri ~ Rm + Rz, data = df, subset = df$Year==i & df$Firm ==j)
    # list() keeps Year and the estimates numeric; c() would coerce the whole row to character
    m[l,] <- list(i,
                  as.character(j),
                  mod$coefficients[2],
                  mod$coefficients[3],
                  summary(mod)$sigma)
  }
}
names(m) <- c("Year", "Firm", "B1", "B2", "e")
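
As an alternative to the nested loops and the row counter, the same kind of table can be built by splitting the data into one piece per Year/Firm pair. This is a different technique (base R `split` plus `lapply`), sketched on made-up data; `df` here is hypothetical sample data, and `groups` and `rows` are names I introduced.

```r
# Hypothetical sample data; only the column layout follows the question.
set.seed(42)
df <- data.frame(
  Year = rep(2009:2011, each = 20),
  Firm = rep(rep(c("A", "B"), each = 10), times = 3),
  Ri   = rnorm(60, 5, 2),
  Rm   = rnorm(60, 2, 1),
  Rz   = rnorm(60, -1, 0.5)
)

# One data frame per Year/Firm combination; drop = TRUE skips empty ones.
groups <- split(df, list(df$Year, df$Firm), drop = TRUE)

# Fit one regression per group and keep only the pieces of interest.
rows <- lapply(groups, function(g) {
  mod <- lm(Ri ~ Rm + Rz, data = g)
  data.frame(Year = g$Year[1],
             Firm = g$Firm[1],
             B1   = coef(mod)[["Rm"]],
             B2   = coef(mod)[["Rz"]],
             e    = summary(mod)$sigma)
})

# Stack the one-row data frames into a single table.
m <- do.call(rbind, rows)
rownames(m) <- NULL
m
```

Because each row is built as a data frame, Year and the estimates stay numeric without any extra conversion.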


1

You can loop through each Firm and Year to create a unique lm for each like so:

#Assume your data frame is named df
#Convert Firm and Year to factor variables
df$Firm <- as.factor(df$Firm)
df$Year <- as.factor(df$Year)

#Loop through each level in Firm and Year and generate lm for each
for(i in levels(df$Firm)){
  for(j in levels(df$Year)){
    assign(paste0('lm', i, j), lm(Ri~Rm+Rz, data=df[df$Firm==i & df$Year==j,]))
  }
}
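
With more than 1000 firms, `assign` would create thousands of separate objects in the workspace. A variation that may be easier to work with is to collect the models in a single named list. The sketch below uses made-up data, and `models` and `key` are names I chose for illustration.

```r
# Hypothetical sample data; only the column layout follows the question.
set.seed(7)
df <- data.frame(
  Year = rep(2009:2011, each = 20),
  Firm = rep(rep(c("A", "B"), each = 10), times = 3),
  Ri   = rnorm(60, 5, 2),
  Rm   = rnorm(60, 2, 1),
  Rz   = rnorm(60, -1, 0.5)
)

# Store every fitted model in one list, keyed like "lmA2009".
models <- list()
for (i in unique(df$Firm)) {
  for (j in unique(df$Year)) {
    key <- paste0("lm", i, j)
    models[[key]] <- lm(Ri ~ Rm + Rz, data = df[df$Firm == i & df$Year == j, ])
  }
}

names(models)                  # all Firm/Year keys
coef(models[["lmA2009"]])      # intercept and slopes for firm A in 2009
```

A list like this can then be passed to `lapply` or `sapply` to pull out coefficients for all firms at once.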
