Suppose I have a function that takes a parameter var_name, where var_name is the name of a column in the data frame:

library(dplyr) 
calculate_mean <- function(data, var_name) {
    lapply(select(data, var_name), mean, na.rm=TRUE)
}

However, I am getting the error:

Error: All select() inputs must resolve to integer column positions.
    The following do not: *  var_name

1 Answer

df <- head(iris)

f <- function(data, var_name) {
  select(data, var_name)
}

f(df, "Petal.Width")
#Error: All select() inputs must resolve to integer column positions.
#The following do not:
#*  var_name

The author of dplyr provides alternative versions of many functions that accept character strings as arguments; their names end in an underscore. Try adding an underscore to the function name:

f2 <- function(data, var_name) {
  select_(data, var_name)
}

f2(df, "Petal.Width")
#  Petal.Width
#1         0.2
#2         0.2
#3         0.2
#4         0.2
#5         0.2
#6         0.4
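
In dplyr 0.7.0 and later, the underscored verbs such as select_() are deprecated. A minimal sketch of the same function for newer versions, assuming dplyr 1.0.0 or later (where all_of() is available), might look like this; f3 is a hypothetical name used only for illustration:

```r
library(dplyr)

df <- head(iris)

# f3 is a hypothetical name. all_of() tells select() to treat var_name as a
# character vector of column names, not as a column literally called var_name.
f3 <- function(data, var_name) {
  select(data, all_of(var_name))
}

f3(df, "Petal.Width")
```

all_of() raises an error if a requested column is missing, which is usually what you want inside a function.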

Further Explanation

Usually an unquoted name is treated as a variable. If we type x in the console, the evaluator searches the environment for a variable with that name. The same search happens when the name is passed to a function: with mean(x), the variable x must be defined.

This behavior can become confusing when a function is written not to search for a variable. This is called non-standard evaluation (NSE). Base R has functions that use NSE: subset(df, select = -Petal.Width) returns the data frame without Petal.Width, even though no variable named Petal.Width exists outside the data frame. This convenience makes interactive work easier. select was designed in a similar way.
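
The contrast between the two evaluation styles can be sketched in base R alone. show_expr below is a hypothetical helper, not part of any package; it only illustrates how a function can capture the code of its argument instead of its value:

```r
df <- head(iris)

# Standard evaluation: the argument is evaluated first, so x must exist.
x <- c(1, 2, 3)
mean(x)

# Non-standard evaluation: Petal.Width is not a variable in the global
# environment, yet subset() accepts it because it looks the name up in df.
remaining <- names(subset(df, select = -Petal.Width))
remaining

# show_expr is a hypothetical helper. substitute() captures the unevaluated
# argument, and deparse() turns it into a character string.
show_expr <- function(arg) deparse(substitute(arg))
show_expr(Petal.Width)
```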

Your own function evaluates its arguments in the standard way: unquoted arguments are treated as variables. Inside it, however, you call select, which uses NSE. select looks for a column named var_name, even though you expected var_name to be replaced by the user's input. Let's demonstrate the behavior by creating a literal var_name column:

df$var_name <- 1
f(df, "Petal.Width")
#  var_name
#1        1
#2        1
#3        1
#4        1
#5        1
#6        1

The original function with select returned the column var_name, not the column we hoped for. Hadley Wickham created select_ in part to deal with this discrepancy.
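
Applied to the calculate_mean function from the question, a sketch that avoids the ambiguity could look like the following. It assumes dplyr 1.0.0 or later and uses all_of() in place of select_(), so this is one possible approach and not the code from the original answer:

```r
library(dplyr)

calculate_mean <- function(data, var_name) {
  # all_of() forces var_name to be read as a character vector of column names
  lapply(select(data, all_of(var_name)), mean, na.rm = TRUE)
}

df <- head(iris)
df$var_name <- 1  # a literal var_name column no longer interferes

result <- calculate_mean(df, "Petal.Width")
result
```

Because all_of(var_name) is a function call, var_name is looked up in the function's environment and not among the columns of data, so the user's input is what gets selected.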

For more information on NSE, see http://adv-r.had.co.nz/Computing-on-the-language.html

2 Comments

What's the difference between select() and select_()?
@Neel - One uses non-standard evaluation (select()), the other standard (select_()).