I have a function that inputs a data.frame and outputs the residual version of it with some chosen variable as predictor.

residuals.DF = function(data, resid.var, suffix="") {
  lm_f = function(x) {
    x = residuals(lm(data=data, formula= x ~ eval(parse(text=resid.var))))
  }
  resid = data.frame(apply(data,2,lm_f))
  colnames(resid) = paste0(colnames(data),suffix)
  return(resid)
}

set.seed(31233)
df = data.frame(Age = c(1,3,6,7,3,8,4,3,2,6),
                Var1 = c(19,45,76,34,83,34,85,34,27,32),
                Var2 = round(rnorm(10)*100))

df.res = residuals.DF(df, "Age", ".test")
df.res
        Age.test   Var1.test  Var2.test
1  -1.696753e-17 -25.1351351  -90.20582
2  -1.318443e-19  -0.8108108   31.91892
3  -5.397735e-18  27.6756757   84.10603
4  -5.927747e-18 -15.1621622 -105.83160
5  -3.807699e-18  37.1891892  -57.08108
6  -6.457759e-18 -16.0000000  -25.76923
7   5.117344e-17  38.3513514  -65.01871
8  -3.807699e-18 -11.8108108   35.91892
9  -3.277687e-18 -17.9729730   97.85655
10 -5.397735e-18 -16.3243243   94.10603

This works fine. However, I often need the eval(parse()) combination when passing variable inputs to lm(), so I decided to write a wrapper function:

#Wrapper function for convenience for evaluating strings
evalparse = function(string) {
  eval(parse(text=string))
}

This works fine when used alone, e.g.:

> evalparse("5+5")
[1] 10

However, if I use it in place of eval(parse(text=resid.var)) inside the function above, I get:

> df.res = residuals.DF(df, "Age", ".test")
Error in eval(expr, envir, enclos) : object 'Age' not found 

I figure this is because the wrapper evaluates the string in its own environment, where the chosen variable does not exist. This does not happen with the bare eval(parse()) combination, because the string is then evaluated in the lm() environment, where the chosen variable is available.
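The scoping behaviour described here can be reproduced without lm(). The following is a minimal sketch, not a fix for the function above: evalparse_in, caller and local_value are hypothetical names introduced only for illustration, and the sketch assumes that local_value is not defined in the global environment.

```r
# Wrapper from the question: eval() runs inside evalparse()'s own frame,
# whose enclosing environment is where evalparse() was defined.
evalparse <- function(string) {
  eval(parse(text = string))
}

# Hypothetical variant: evaluate in the caller's frame instead.
evalparse_in <- function(string, envir = parent.frame()) {
  eval(parse(text = string), envir = envir)
}

caller <- function() {
  local_value <- 42
  list(
    # The wrapper cannot see local_value, so the error message is captured.
    wrapper = tryCatch(evalparse("local_value"),
                       error = function(e) conditionMessage(e)),
    # The variant looks the name up in caller()'s frame.
    variant = evalparse_in("local_value")
  )
}

out <- caller()
out
```

Whether such a variant also resolves the lm() case depends on how the formula is evaluated, so it is shown only to illustrate the environment lookup.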

Is there some clever solution to this problem? A better way of using dynamic formulas in lm()? Otherwise I will have to keep typing eval(parse(text=object)).

  • Have you tried mget() in place of eval(parse())? Commented Apr 3, 2015 at 17:25
  • get() worked in the above example, mget() didn't (returned wrong type list). Commented Apr 3, 2015 at 17:46
  • Ah yeah, sorry, I meant get(). You could use mget() but you'd need mget()[[1]] to get the item from the list. If get() works for you I'll post it as an answer rather than a comment. Commented Apr 3, 2015 at 17:51
  • If x and y are the names of two columns in data frame DF then this regresses y on x (with an intercept): lm(DF[c(y, x)]) without using parse, eval, formulas, etc. Commented Apr 3, 2015 at 20:17
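The two suggestions in the comments above can be sketched as follows. This is only an illustration on the built-in mtcars data; the names resp and pred are assumptions, not part of the original question.

```r
resp <- "mpg"
pred <- "wt"

# Reference fit with a hard-coded formula.
fit_ref <- lm(mpg ~ wt, data = mtcars)

# get() looks the column up by name while the model frame is built.
fit_get <- lm(mpg ~ get(pred), data = mtcars)

# Passing a data frame: the first column is treated as the response,
# the remaining columns as predictors.
fit_df <- lm(mtcars[c(resp, pred)])

# The coefficient values agree; only the term label differs for get().
unname(coef(fit_ref))
unname(coef(fit_get))
unname(coef(fit_df))
```

One trade-off of the get() form is that the predictor is labelled get(pred) in the fitted model rather than by its column name.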

1 Answer

Anytime you're trying to perform operations that modify the contents of a formula, you should use update because it is designed for this purpose.

In your case, you want to modify your function as follows:

residuals.DF = function(data, resid.var, suffix="") {
  lm_f = function(x) {
    x = residuals(lm(data=data, formula= update(x ~ 0, paste0("~",resid.var))))
  }
  resid = data.frame(apply(data,2,lm_f))
  colnames(resid) = paste0(colnames(data),suffix)
  return(resid)
}

Basically, update (or the update.formula method specifically) takes a formula as its first argument, and then allows for modifications based on its second argument. To get a handle on it, check out the following examples:

f <- y ~ x
f
# y ~ x
update(f, ~ z)
# y ~ z
update(f, x ~ y)
# x ~ y
update(f, "~ x + y")
# y ~ x + y
update(f, ~ . + z + w)
# y ~ x + z + w
x <- "x"
update(f, paste0("~",x))
# y ~ x

As you can see, the second argument can be a formula or character string containing one or more variables. This greatly simplifies the creation of a dynamically modified formula where you are only trying to change one part of the formula.
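As a further sketch of the same idea, the response can also be supplied as a string, so that each column is regressed on the chosen predictor by name. The helper residualize is hypothetical and not part of the original answer; it assumes syntactically valid column names and skips the predictor column itself.

```r
# Hypothetical helper: residuals of every other column regressed on `predictor`.
residualize <- function(data, predictor) {
  rhs <- paste("~", predictor)
  cols <- setdiff(names(data), predictor)
  out <- lapply(cols, function(col) {
    # Start from `col ~ 0`, then swap in the right-hand side with update().
    f <- update(as.formula(paste(col, "~ 0")), rhs)
    residuals(lm(f, data = data))
  })
  names(out) <- cols
  as.data.frame(out)
}

res <- residualize(mtcars[c("mpg", "wt", "hp")], "wt")
head(res)
```

Looping over column names rather than using apply() keeps each column's type, since apply() first converts the data frame to a matrix.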

2 Comments

That is neat, yes. Someone posted a solution based on as.formula() before, but decided to delete it apparently. Goes like this: x = residuals(lm(data=data, formula= as.formula(paste0("x ~ ",resid.var, collapse=""))))
Also, don't forget ?reformulate for these kinds of tasks.
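The two alternatives mentioned in these comments can be compared in a short sketch. The variable names and the use of mtcars are assumptions for illustration only.

```r
resp <- "mpg"
pred <- "wt"

# reformulate() builds a formula from character term labels.
f1 <- reformulate(pred, response = resp)

# as.formula() parses a pasted string.
f2 <- as.formula(paste(resp, "~", pred))

# Several predictors can be passed to reformulate() as a character vector.
f3 <- reformulate(c("wt", "hp"), response = resp)

fit <- lm(f1, data = mtcars)
coef(fit)
```

Here reformulate() avoids assembling the formula text by hand, which may be more convenient when the number of predictors varies.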
