
If I have a data table, foo, in R with a column named "date", I can get the vector of date values by the notation

foo[, date]

(Unlike with data frames, date doesn't need to be in quotes.)

How can this be done programmatically? That is, if I have a variable x whose value is the string "date", then how do I access the column of foo with that name?

Something that sort of works is to create a symbol:

sym <- as.name(x)
v <- foo[, eval(sym)]

...

As I say, that sort of works, but there is something not quite right about it. If that code is inside a function myFun in package myPackage, then it seems that it doesn't work if I explicitly use the package through:

myPackage::myFun(...)

I get an error message saying "undefined columns selected".

[edited] Some more details

Suppose I create a package called myPackage. This package has a single file with the following in it:

library(data.table)
#' @export
myFun <- function(table1) {
    names1 <- names(table1)
    name1 <- names1[[1]]
    sym <- as.Name(name1)
    table1[, eval(sym)]
}

If I load that function using R Studio, then

myFun(tbl)

returns the first column of the data table tbl.

On the other hand, if I call

myPackage::myFun(tbl)

it doesn't work. It complains about

Error in .subset(x, j) : invalid subscript type 'builtin'

I'm just curious as to why myPackage:: would make this difference.

  • Try x <- "date" ; foo[, x, with = FALSE], or just foo[[x]] Commented Sep 10, 2014 at 23:01
  • Thank you! Just because I'm curious, do you have any idea why my foo[, eval(sym)] would work in some cases, but not others? It seems that I get different behavior if I call myFun(...) versus myPackage::myFun. I'm guessing that the :: screws up the namespace for symbols? Commented Sep 10, 2014 at 23:08
  • That I can't help you with, sorry. I can bring this question to the attention of one of the data.table authors if you want, but I doubt he would be able to help with so little information. Either way, there is no reason to use either as.name or eval. You can evaluate names within data.table just by putting them into () Commented Sep 10, 2014 at 23:10
  • Without knowing what else is done in "myFun" and "myPackage" (in terms of creating a data.table aware package), the second part of the question is unanswerable. Commented Sep 10, 2014 at 23:14
  • I'm not used to StackOverflow. It would be enormously helpful if there was a preview function. Commented Sep 10, 2014 at 23:47
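
To make the suggestions in the comments concrete, here is a minimal sketch of a few ways to pull out a column whose name is stored in a string. The table contents are made up for illustration; only foo, x and the column name date come from the question.

```r
library(data.table)

# Hypothetical example data; only the names 'foo', 'x' and 'date' come from the question
foo <- data.table(date = as.Date("2014-09-10") + 0:2, value = 1:3)
x <- "date"

v1 <- foo[[x]]                 # plain vector, same as for a list or data frame
v2 <- foo[, x, with = FALSE]   # one-column data.table rather than a vector
v3 <- foo[, get(x)]            # vector; get() looks the name up among the columns
v4 <- foo[, eval(as.name(x))]  # the approach from the question
```

Of these, foo[[x]] relies only on list-style extraction, so it does not depend on data.table's special evaluation of the j argument.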

2 Answers

A quick way which points to a longer way is this:

subset(foo, TRUE, date)

The subset function accepts unquoted symbols/names for its 'subset' and 'select' arguments. (Its author, however, thinks this was a bad idea and suggests we use formulas instead.) This was the jumping-off point for sections of Hadley Wickham's Advanced R website (and book): http://adv-r.had.co.nz/Computing-on-the-language.html and http://adv-r.had.co.nz/Functional-programming.html. You can also look at the code for subset.data.frame:

> subset.data.frame
function (x, subset, select, drop = FALSE, ...) 
{
    r <- if (missing(subset)) 
        rep_len(TRUE, nrow(x))
    else {
        e <- substitute(subset)
        r <- eval(e, x, parent.frame())
        if (!is.logical(r)) 
            stop("'subset' must be logical")
        r & !is.na(r)
    }
    vars <- if (missing(select)) 
        TRUE
    else {
        nl <- as.list(seq_along(x))
        names(nl) <- names(x)
        eval(substitute(select), nl, parent.frame())
    }
    x[r, vars, drop = drop]
}

The problem with "naked" expressions that get passed into functions is that they are sometimes evaluated in a different frame than expected. R formulas, like functions, carry a pointer to the environment in which they were defined.
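
A small base-R sketch of both points, using made-up names (df, first_match, make_formula, cutoff are all hypothetical): wrapping subset() in another function breaks the lookup of the unquoted expression, while a formula remembers where it was created.

```r
df <- data.frame(a = 1:5, b = letters[1:5])

# Hypothetical wrapper that passes its argument straight through to subset()
first_match <- function(data, condition) {
    head(subset(data, condition), 1)
}

# subset() sees the symbol 'condition', not 'a >= 4'; forcing it evaluates
# 'a >= 4' in the caller's environment, where no object 'a' exists
res <- tryCatch(first_match(df, a >= 4),
                error = function(e) conditionMessage(e))

# A formula keeps a reference to the environment it was created in
make_formula <- function() {
    cutoff <- 4      # local to this call
    ~ a >= cutoff
}
fm <- make_formula()
cutoff_seen <- get("cutoff", envir = environment(fm))

# Evaluate the formula's right-hand side with df as data and the
# formula's own environment as the fallback for 'cutoff'
rows <- df[eval(fm[[2]], df, environment(fm)), ]
```

Here res holds an error message instead of a data frame, while rows contains the rows where a is at least 4.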

I think the problem is that you've defined myFun in your global environment, so it only appeared to work.

I changed as.Name to as.name, and created a package with the following functions:

library(data.table)
myFun <- function(table1) {
    names1 <- names(table1)
    name1 <- names1[[1]]
    sym <- as.name(name1)
    table1[, eval(sym)]
}
myFun_mod <- function(dt) {
    # dt[, eval(as.name(colnames(dt)[1]))]
    dt[[colnames(dt)[1]]]
}

Then, I tested it using this:

library(data.table)
myDt <- data.table(a=letters[1:3],b=1:3)
myFun(myDt)
myFun_mod(myDt)

myFun didn't work; myFun_mod did.

The output:

> library(test)
> myFun(myDt)
Error in eval(expr, envir, enclos) : object 'a' not found
> myFun_mod(myDt)
[1] "a" "b" "c"

Then I added the following line to the NAMESPACE file: import(data.table)

This is what @mnel was talking about with this link: Using data.table package inside my own package

After adding import(data.table), both functions work.
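
If the package is documented with roxygen2 (an assumption; the comment line above myFun in the question suggests it), the same NAMESPACE entry can be generated from the source file instead of being written by hand. A minimal sketch, which also assumes data.table is listed under Imports in DESCRIPTION:

```r
# The @import tag makes roxygen2 write import(data.table) into NAMESPACE;
# the @export tag writes export(myFun)
#' @import data.table
#' @export
myFun <- function(table1) {
    # Extract the first column by name; [[ works the same way for
    # data tables and data frames
    table1[[names(table1)[1]]]
}
```

With this in place there should be no need for a library(data.table) call inside the package source.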

I'm still not sure why you got the particular .subset error, which is why I went through the effort of reproducing the result...
