R - using regression functions within a group

Question

Suppose I have a dataframe df with three variables df$x, df$y, df$z, and there is a grouping variable df$g.

Usually, to compute a function WITHIN each group, I do the following

df$new<-unlist(tapply(df$x,df$g,FUN=myfunc))

Now suppose I want the residuals from a regression of x on y and z WITHIN each value of the grouping variable g. How do I implement that?

More specifically, without using groups, I would have done

df$new <- resid(lm(df$x ~ df$y + df$z, na.action = na.exclude))
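The `na.exclude` option matters for the assignment back into the data frame: with it, `resid()` pads the result with `NA` for incomplete rows, so the residual vector has one entry per row. A minimal sketch on made-up toy data (the values below are hypothetical, chosen only to include one missing response):

```r
# Toy data (hypothetical values) with one missing response in row 3
df <- data.frame(x = c(1.0, 2.1, NA, 4.2, 4.9),
                 y = c(1, 2, 3, 4, 5),
                 z = c(2, 1, 4, 3, 6))

fit_omit    <- lm(x ~ y + z, data = df, na.action = na.omit)
fit_exclude <- lm(x ~ y + z, data = df, na.action = na.exclude)

length(resid(fit_omit))     # 4: the incomplete row is dropped
length(resid(fit_exclude))  # 5: NA is inserted at the incomplete row

# Lengths match nrow(df), so the assignment works without recycling errors
df$new <- resid(fit_exclude)
```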

One solution to carry out the previous operation WITHIN groups is to loop over the unique elements of `df$g`, but it would be great if there were a vectorized solution.
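For reference, the loop approach mentioned above might look roughly like the following. This is only a sketch on simulated data; the data frame contents and the variable names `grp`, `rows`, and `fit` are assumptions made for illustration.

```r
# Simulated toy data (hypothetical): three groups of six rows each
set.seed(1)
df <- data.frame(g = rep(c("a", "b", "c"), each = 6),
                 y = rnorm(18),
                 z = rnorm(18))
df$x <- 1 + 2 * df$y - df$z + rnorm(18)

# Fit one regression per group and write the residuals back into
# the rows that belong to that group, preserving the row order
df$new <- NA_real_
for (grp in unique(df$g)) {
  rows <- df$g == grp
  fit <- lm(x ~ y + z, data = df[rows, ], na.action = na.exclude)
  df$new[rows] <- resid(fit)
}
```

Indexing by `rows` keeps each residual aligned with its original row, which `unlist(tapply(...))` does not guarantee when the data are not already sorted by group.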

Did you check ddply from the plyr package?

– Metrics, Commented Aug 4, 2013 at 16:42
Check last example in ?by

– Henrik, Commented Aug 4, 2013 at 16:55
This post may be of some help.

– Arun, Commented Aug 4, 2013 at 16:58

Ricardo Saporta · Accepted Answer · 2013-08-04 16:50:22Z

1

In data.table you can use by

library(data.table)
DT <- data.table(df)


DT[, new := resid(lm(x ~ y + z, na.action = na.exclude)), by = g]

edited Aug 4, 2013 at 16:50

answered Aug 4, 2013 at 16:44


Metrics · Accepted Answer · 2013-08-04 16:56:44Z

1

library(plyr)
ddply(mydata, .(g), transform, new = resid(lm(x ~ y + z, na.action = na.exclude)))

Test using mtcars data:

mydata<-mtcars

myres<-ddply(mydata,.(carb),transform, new=resid(lm(mpg ~ disp + hp))) # g=carb, x=mpg,y=disp,z=hp
> head(myres)
   mpg cyl  disp  hp drat    wt  qsec vs am gear carb         new
1 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1  0.20604566
2 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  2.03023747
3 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 -2.39754247
4 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  1.31212635
5 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  2.60271481
6 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  0.03913515
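Note that the output above comes back sorted by `carb` and without the original row names. If the original row order matters, a base R alternative is to split the data by group, fit each piece, and reassemble with `unsplit()`. This is a sketch, not part of the original answer; the helper name `group_resid` is made up for illustration.

```r
# Hypothetical helper: residuals of mpg ~ disp + hp for one group's rows
group_resid <- function(d) {
  resid(lm(mpg ~ disp + hp, data = d, na.action = na.exclude))
}

mydata <- mtcars

# Split rows by carb, compute residuals per piece, then put the values
# back in the original row order; unname() drops leftover vector names
pieces <- split(mydata, mydata$carb)
mydata$new <- unname(unsplit(lapply(pieces, group_resid), mydata$carb))

head(mydata[, c("mpg", "disp", "hp", "carb", "new")])
```

Row names and row order of `mtcars` are kept, so the new column lines up with the original data.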

edited Aug 4, 2013 at 16:56

answered Aug 4, 2013 at 16:44
