
Given two data.tables of the same shape:

library(data.table)

dt1 <- data.table(id = c(1,-99,2,2,-99), a = c(2,1,-99,-99,3), b = c(5,3,3,2,5), c = c(-99,-99,-99,2,5))
dt2 <- data.table(id = c(2,3,1,4,3), a = c(6,4,3,2,6), b = c(3,7,8,8,3), c = c(2,2,4,3,2))

> dt1
    id   a b   c
1:   1   2 5 -99
2: -99   1 3 -99
3:   2 -99 3 -99
4:   2 -99 2   2
5: -99   3 5   5

> dt2
   id a b c
1:  2 6 3 2
2:  3 4 7 2
3:  1 3 8 4
4:  4 2 8 3
5:  3 6 3 2

How can one replace each -99 in dt1 with the value at the same position in dt2?

The desired result is dt3:

> dt3
   id a b c
1:  1 2 5 2
2:  3 1 3 2
3:  2 3 3 4
4:  2 2 2 2
5:  3 3 5 5

5 Answers

3

You can do the following:

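# Convert to data.frame so that indexing with a logical matrix works
dt3 <- as.data.frame(dt1)
df2 <- as.data.frame(dt2)
# Replace every -99 with the value at the same position in dt2
dt3[dt3 == -99] <- df2[dt3 == -99]
dt3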

#   id a b c
# 1  1 2 5 2
# 2  3 1 3 2
# 3  2 3 3 4
# 4  2 2 2 2
# 5  3 3 5 5

3

If your data is all of the same type (as in your example), then converting both tables to matrices is much faster and more transparent:

dt1a <- as.matrix(dt1)  ## convert to matrix
dt2a <- as.matrix(dt2)

# build a logical matrix of the same shape that marks the entries to replace
missing_idx <- dt1a == -99
dt1a[missing_idx] <- dt2a[missing_idx]  ## overwrite the marked entries position by position

This is a vectorized operation, so it should be fast.

Note: If you do this, make sure the two data sources match exactly in shape and in the order of rows and columns. If they don't, you need to join by the relevant keys and pick the correct columns.
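
The shape check from the note can be made explicit, and the replacement can also be done without leaving data.table. The following is only a minimal sketch, assuming both tables have identical column names and row order. `set()` is data.table's by-reference assignment function; the `dt3` copy and the per-column loop are illustrative choices, not part of the answer above.

```r
library(data.table)

dt1 <- data.table(id = c(1,-99,2,2,-99), a = c(2,1,-99,-99,3), b = c(5,3,3,2,5), c = c(-99,-99,-99,2,5))
dt2 <- data.table(id = c(2,3,1,4,3), a = c(6,4,3,2,6), b = c(3,7,8,8,3), c = c(2,2,4,3,2))

# Guard: positional replacement only makes sense if both tables line up.
stopifnot(identical(dim(dt1), dim(dt2)), identical(names(dt1), names(dt2)))

dt3 <- copy(dt1)  # work on a copy so dt1 stays untouched; set() modifies dt3 in place
for (col in names(dt3)) {
  rows <- which(dt3[[col]] == -99)  # rows holding the sentinel in this column
  if (length(rows) > 0L) {
    set(dt3, i = rows, j = col, value = dt2[[col]][rows])
  }
}
dt3
```

Working column by column avoids the matrix conversion, so this sketch should also cope with columns of different types, as long as each column of dt2 has the same type as its counterpart in dt1.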

EDIT: The conversion to matrix may be unnecessary. See kath's answer for a more terse solution.


2

A simple way is to use the setDF function to convert both tables to data.frames, use data.frame subsetting for the replacement, and convert back to data.table at the end.

# Change to data.frame
setDF(dt1)
setDF(dt2)

# Perform assignment 
dt1[dt1==-99] = dt2[dt1==-99]

# Restore back to data.table    
setDT(dt1)
setDT(dt2)

dt1
#   id a b c
# 1  1 2 5 2
# 2  3 1 3 2
# 3  2 3 3 4
# 4  2 2 2 2
# 5  3 3 5 5


2

A simple and efficient trick is to convert both tables to matrices, replace by logical index, and convert back:

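dt1 <- as.matrix(dt1)
dt2 <- as.matrix(dt2)

# TRUE wherever dt1 holds the -99 placeholder
index.replace <- dt1 == -99
dt1[index.replace] <- dt2[index.replace]

# Assign the results; otherwise the conversion back to data.table is discarded
dt1 <- as.data.table(dt1)
dt2 <- as.data.table(dt2)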


1

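A simple, though slower, approach is an explicit loop over every cell:

for (j in seq_len(ncol(dt1))) {
  for (i in seq_len(nrow(dt1))) {
    # In a data.table, dt1[i, j] with j as a loop variable does not select
    # column j, so read with [[ ]] and write with set()
    if (dt1[[j]][i] == -99) set(dt1, i = i, j = j, value = dt2[[j]][i])
  }
}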
