Suppose I have a data frame in the environment, mydata, with three columns, A, B, C.

mydata = data.frame(A=c(1,2,3),
                    B=c(4,5,6),
                    C=c(7,8,9))

I can create a linear model with

lm(C ~ A, data=mydata)

I want a function to generalize this, to regress B or C on A, given just the name of the column, i.e.,

f = function(x){
  lm(x ~ A, data=mydata)
}
f(B)
f(C)

or

g = function(x){
  lm(mydata$x ~ mydata$A)
}
g(B)
g(C)

These solutions don't work. I know something is wrong with the evaluation, and I have tried permutations of quo(), enquo() and !!, but without success.

This is a simplified example, but the idea is, when I have dozens of similar models to build, each fairly complicated, with only one variable changing, I want to do so without repeating the entire formula each time.

3 Answers

If we want to pass an unquoted column name, one option is {{ }} from the tidyverse. With select, the argument can be either a string or an unquoted name.

library(dplyr)
printcol2 <- function(data, x) {
  data %>%
    select({{x}})
}

printcol2(mydata, A)
#  A
#1 1
#2 2
#3 3
printcol2(mydata, 'A')
#  A
#1 1
#2 2
#3 3

If the OP wants to pass an unquoted column name through to lm:

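f1 <- function(x) {
  # Capture the unquoted argument as a string, e.g. "B"
  rsp <- deparse(substitute(x))
  # Build the formula rsp ~ A
  fmla <- reformulate("A", response = rsp)
  out <- lm(fmla, data = mydata)
  # Store a real call (not a symbol) so the printed Call shows the formula
  out$call <- bquote(lm(.(fmla), data = mydata))
  out
}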

f1(B)

#Call:
#lm(B ~ A, data = mydata)

#Coefficients:
#(Intercept)            A  
#          3            1  

f1(C)

#Call:
#lm(C ~ A, data = mydata)

#Coefficients:
#(Intercept)            A  
#          6            1  
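
For the many-models case in the question, the same reformulate() idea can be looped over response names given as strings. This is only a minimal sketch: fit_response, responses and models are names made up for illustration, and the single predictor "A" stands in for whatever longer right-hand side the real models share.

```r
mydata <- data.frame(A = c(1, 2, 3),
                     B = c(4, 5, 6),
                     C = c(7, 8, 9))

# Hypothetical helper: fit `response ~ A`, with the response given as a string
fit_response <- function(response, data) {
  fmla <- reformulate("A", response = response)
  lm(fmla, data = data)
}

# One model per response column, stored in a list named by response
responses <- c("B", "C")
models <- lapply(setNames(responses, responses), fit_response, data = mydata)

coef(models$B)
formula(models$C)
```

The printed Call of each model will show fmla rather than the expanded formula; formula(models$B) recovers the formula that was actually fitted.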

1 Comment

Thanks akrun. You solved the problem, but not the way I needed, so I edited the question.

Maybe you are looking for deparse(substitute(x)). It turns the unquoted argument into a string, which can then be pasted into a formula.

f = function(x, data = mydata){
  y <- deparse(substitute(x))
  fmla <- paste(y, 'Species', sep = '~')
  lm(as.formula(fmla), data = data)
}

mydata <- iris
f(Sepal.Length)
#
#Call:
#lm(formula = as.formula(fmla), data = data)
#
#Coefficients:
#      (Intercept)  Speciesversicolor   Speciesvirginica  
#            5.006              0.930              1.582  

f(Petal.Width)
#
#Call:
#lm(formula = as.formula(fmla), data = data)
#
#Coefficients:
#      (Intercept)  Speciesversicolor   Speciesvirginica  
#            0.246              1.080              1.780
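
If the function should take the column either as a bare name or as a string, one possible variant is to inspect what substitute() returns before deparsing. A minimal sketch, assuming the built-in iris data; fit_by_species is a hypothetical name, not part of the answer above.

```r
# Hypothetical helper: regress a column on Species, accepting the column
# either as a bare name or as a string literal
fit_by_species <- function(x, data = iris) {
  arg <- substitute(x)
  # A string literal arrives as a character vector, a bare name as a symbol
  y <- if (is.character(arg)) arg else deparse(arg)
  fmla <- reformulate("Species", response = y)
  lm(fmla, data = data)
}

m1 <- fit_by_species(Sepal.Length)
m2 <- fit_by_species("Sepal.Length")
all.equal(coef(m1), coef(m2))
```

This only recognises a string written directly in the call. If the name is held in another variable, for example inside a loop, substitute() returns that variable's name instead, so building the formula from the string directly is the safer route there.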


I think generally, you might be looking for:

printcol <- function(x){
  print(x)
}

printcol(mydata$A)

This doesn't involve any fancy evaluation, you just need to specify the variable you'd like to subset in your function call.

This gives us:

[1] 1 2 3

Note that you're only printing the vector A, and not actually subsetting column A from mydata.
