
I am trying to perform an update join of two data.tables, where the names of the join fields (more than one) are stored in a variable. Below is an example:

library(data.table)
DT1 <- data.table(col1 = 1:5, col2 = 5:1, lett = letters[1:5])
DT2 <- data.table(col1 = c(1:3, 2:5, 1), col2 = c(5:3, 4:1, 5))

joinFields <- c('col1', 'col2')

I tried doing it this way:

DT1[DT2,
    on=c(paste0(joinFields, '=', joinFields)),
    nomatch=0L]

This way is based on a solution suggested in Join datatables using column names stored in variables.

dt1[dt2_temp, 
    on=c(paste0(varName, ">valueMin"), paste0(varName, "<=valueMax")),
    nomatch=0L]

It does not work. My case is a bit different from the linked example, which builds its conditions with two separate paste0() calls. Is there a solution that still lets me use on = c()?

Edit: I am aware I can do it with merge().

  • You mention "update join" but don't use :=. Could you show and explain the desired output? By the way, you might need "==" rather than "=" there, e.g. DT1[DT2, on=sprintf("%s==%s", joinFields, joinFields)] Commented Mar 6, 2019 at 16:40
  • The keys used to join appear to be the same in both tables, so why not DT1[DT2, on=joinFields, nomatch=0L]? Commented Mar 7, 2019 at 0:47
  • @Frank, thanks for your suggestion. It is indeed not an update join. I was a bit lazy and copied from the other thread. I should have added lett := i.lett Commented Mar 7, 2019 at 14:01
  • @chinsoon12, thanks, that is a very short solution. Commented Mar 7, 2019 at 14:01
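To make the two comment suggestions concrete, here is a minimal sketch using the tables from the question. The first call passes joinFields directly as an unnamed character vector. The second is one possible reading of the intended update join: it assumes the goal is to copy lett from DT1 into DT2 by reference, which the question does not state explicitly.

```r
library(data.table)

DT1 <- data.table(col1 = 1:5, col2 = 5:1, lett = letters[1:5])
DT2 <- data.table(col1 = c(1:3, 2:5, 1), col2 = c(5:3, 4:1, 5))
joinFields <- c("col1", "col2")

# Inner join: an unnamed character vector in on= is enough when the
# join columns have the same names in both tables
res <- DT1[DT2, on = joinFields, nomatch = 0L]

# Update join (assumed intent): add lett to DT2 by reference,
# taking the value from the matching row of DT1 via the i. prefix
DT2[DT1, on = joinFields, lett := i.lett]
```

Because := modifies DT2 in place, no assignment is needed for the second call.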

1 Answer

I think you just need to use == instead of =, as follows:

DT1[DT2,
    on=c(paste0(joinFields, '==', joinFields)),
    nomatch=0L]
#    col1 col2 lett
# 1:    1    5    a
# 2:    2    4    b
# 3:    3    3    c
# 4:    2    4    b
# 5:    3    3    c
# 6:    4    2    d
# 7:    5    1    e
# 8:    1    5    a

You do not even need to use c():

DT1[DT2,
    on=paste0(joinFields, '==', joinFields),
    nomatch=0L]
#    col1 col2 lett
# 1:    1    5    a
# 2:    2    4    b
# 3:    3    3    c
# 4:    2    4    b
# 5:    3    3    c
# 6:    4    2    d
# 7:    5    1    e
# 8:    1    5    a

3 Comments

Thank you very much @Carles Sans Fuentes. I must say I have some things to look into, because I don't understand why it works with the == operator. I generally join like this DT1[DT2, on = .(col1 = col1, col2 = col2), nomatch=0L]
@koteletje It's just the documented behavior with strings in on= (from ?data.table)
Happy to help @koteletje. We all need to keep checking stuff. Regarding your question of why it takes two equals signs: besides being stated that way in the data.table documentation, the standard way to check whether one object is exactly equal to another is with a logical operator, and in R equality is written as two equals signs. Check this link for further understanding: statmethods.net/management/operators.html
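As a rough illustration of the string forms discussed in these comments, the sketch below covers the case where the join columns are named differently in the two tables. DT2b, xFields and iFields are hypothetical names introduced only for this example. It builds the on= argument once as "x==i" strings and once as a named character vector (names are the columns of the outer table, values are the columns of the table passed as i), which ?data.table describes as alternative ways to specify the same join.

```r
library(data.table)

DT1 <- data.table(col1 = 1:5, col2 = 5:1, lett = letters[1:5])
# Hypothetical variant of DT2 whose join columns have different names
DT2b <- data.table(key1 = c(1:3, 2:5, 1L), key2 = c(5:3, 4:1, 5L))

xFields <- c("col1", "col2")  # join columns in DT1 (the x table)
iFields <- c("key1", "key2")  # join columns in DT2b (the i table)

# Form 1: one "x==i" string per pair of columns
byString <- DT1[DT2b, on = paste0(xFields, "==", iFields), nomatch = 0L]

# Form 2: named character vector, c(col1 = "key1", col2 = "key2")
byNamed <- DT1[DT2b, on = setNames(iFields, xFields), nomatch = 0L]

# Compare the two results
all.equal(byString, byNamed)
```

When the names are identical in both tables, as in the question, passing the unnamed vector joinFields is the shortest option.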
