
My question is essentially the same as this question: data.table join then add columns to existing data.frame without re-copy.

Basically I have a template with keys and I want to assign columns from other data.tables to the template by the same keys.

> template
    id1 id2
 1:   a   1
 2:   a   2
 3:   a   3
 4:   a   4
 5:   a   5
 6:   b   1
 7:   b   2
 8:   b   3
 9:   b   4
10:   b   5
> x
   id1 id2       value
1:   a   2  0.01649728
2:   a   3 -0.27918482
3:   b   3  0.86933718
> y
   id1 id2     value
1:   a   4 -1.163439
2:   b   4  2.267872
3:   b   5  1.083258
> template[x, value := i.value]
> template[y, value := i.value]
> template
    id1 id2       value
 1:   a   1          NA
 2:   a   2  0.01649728
 3:   a   3 -0.27918482
 4:   a   4 -1.16343917
 5:   a   5          NA
 6:   b   1          NA
 7:   b   2          NA
 8:   b   3  0.86933718
 9:   b   4  2.26787248
10:   b   5  1.08325793
> 

But if x and y each have, say, 100 columns, it is not practical to write out the value := i.value syntax for every column. Is there a way to do the same thing for all the columns in x and y at once?

EDIT: If I do y[x[template]], then it creates separate value columns, which is not intended:

> y[x[template]]
    id1 id2     value     value.1
 1:   a   1        NA          NA
 2:   a   2        NA  0.01649728
 3:   a   3        NA -0.27918482
 4:   a   4 -1.163439          NA
 5:   a   5        NA          NA
 6:   b   1        NA          NA
 7:   b   2        NA          NA
 8:   b   3        NA  0.86933718
 9:   b   4  2.267872          NA
10:   b   5  1.083258          NA
> 
  • Yes, but I want to assign the columns to template. Essentially I want to populate the template with many data.tables like x. For example, x will contain values for some keys and y will contain values for some other keys. So template <- x[template] will not work. Commented Mar 31, 2014 at 17:17
  • @Arun: I added some example to hopefully clarify my case. Commented Mar 31, 2014 at 17:34
  • Great, now I see what you mean. How about this post? You can construct a similar expression and just eval it each time. Commented Mar 31, 2014 at 17:44
  • I think that post will work. I was hoping there would be a more elegant syntax. Thanks. Commented Mar 31, 2014 at 17:55

1 Answer


Just create a function that takes column names as its argument and constructs the := expression for you. Then eval it inside each join, passing the names of the columns you want to copy from that data.table. Here's an illustration:

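library(data.table)   # needed for setattr()

get_expr <- function(x) {
    # 'x' is the names vector, e.g. c("value")
    expr = paste0("i.", x)           # "i.value": the column as it appears in the joined table
    expr = lapply(expr, as.name)     # turn each string into a symbol
    setattr(expr, 'names', x)        # name each symbol after its target column
    as.call(c(quote(`:=`), expr))    # build the call `:=`(value = i.value)
}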

get_expr('value')    ## generates the required expression
# `:=`(value = i.value)

template[x, eval(get_expr("value"))]
template[y, eval(get_expr("value"))]

#     id1 id2       value
#  1:   a   1          NA
#  2:   a   2  0.01649728
#  3:   a   3 -0.27918482
#  4:   a   4 -1.16343900
#  5:   a   5          NA
#  6:   b   1          NA
#  7:   b   2          NA
#  8:   b   3  0.86933718
#  9:   b   4  2.26787200
# 10:   b   5  1.08325800
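
To cover the many-column case from the question, the same helper can be given every non-key column name at once. The following is a minimal sketch rather than a definitive implementation: the tables are made-up stand-ins with two value columns named v1 and v2 (hypothetical names, not from the question), and it assumes that id1 and id2 are the join columns and that a data.table version supporting the on= argument is installed.

```r
library(data.table)

# Same helper as above: builds `:=`(col = i.col, ...) from a names vector.
get_expr <- function(x) {
    expr <- lapply(paste0("i.", x), as.name)
    setattr(expr, "names", x)
    as.call(c(quote(`:=`), expr))
}

keys <- c("id1", "id2")   # assumed join columns

# Illustrative data only; v1 and v2 stand in for the ~100 real columns.
template <- CJ(id1 = c("a", "b"), id2 = 1:5)
x <- data.table(id1 = c("a", "a", "b"), id2 = c(2L, 3L, 3L),
                v1 = c(0.1, 0.2, 0.3), v2 = c(10, 20, 30))
y <- data.table(id1 = c("a", "b", "b"), id2 = c(4L, 4L, 5L),
                v1 = c(0.4, 0.5, 0.6), v2 = c(40, 50, 60))

# For each source table, copy every non-key column into template by reference.
for (dt in list(x, y)) {
    cols <- setdiff(names(dt), keys)
    template[dt, eval(get_expr(cols)), on = keys]
}

print(template)
```

Rows of template with no match in x or y keep NA in v1 and v2. A commonly seen alternative that avoids building the call by hand is template[dt, (cols) := mget(paste0("i.", cols)), on = keys]; it should give the same result, but check it against your data.table version.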