I have two data.tables "commuters_old" and "commuter_new":
commuters_old: (1.703.000 observations)
| from1 | to1 | amount1 |
|---|---|---|
| 1001 | 1001 | 12 |
| 1001 | 1002 | 2 |
| 1001 | 1003 | 23 |
| 2000 | 1002 | 200 |
| 2001 | 2001 | 80 |
| 2001 | 2002 | 98 |
commuters_new: (1.400.000 observations)
| from2 | to2 | amount2 |
|---|---|---|
| 1000 | 1001 | 10 |
| 1001 | 1002 | 22 |
| 1001 | 1004 | 2 |
| 2000 | 1002 | 50 |
| 2001 | 2001 | 8 |
| 2001 | 2003 | 100 |
library(data.table)
commuters_old <- data.table(
from1 = c(1001L, 1001L, 1001L, 2000L, 2001L, 2001L),
to1 = c(1001L, 1002L, 1003L, 1002L, 2001L, 2002L),
amount1 = c(12L, 2L, 23L, 200L, 80L, 98L))
commuters_new <- data.table(
from2 = c(1000L, 1001L, 1001L, 2000L, 2001L, 2001L),
to2 = c(1001L, 1002L, 1004L, 1002L, 2001L, 2003L),
amount2 = c(10L, 22L, 2L, 50L, 8L, 100L))
I would like to add the column "amount1" to data.table "commuters_new" but only if "from1 == from2" AND "to1 == to2". If this condition is not met, I want to write 0, so that:
commuters_new:
| from2 | to2 | amount2 | amount1 |
|---|---|---|---|
| 1000 | 1001 | 10 | 0 |
| 1001 | 1002 | 22 | 2 |
| 1001 | 1004 | 2 | 0 |
| 2000 | 1002 | 50 | 200 |
| 2001 | 2001 | 8 | 80 |
| 2001 | 2003 | 100 | 0 |
After that merge, I want to add another column "factor" that gives me the factor of "amount2" and"amount1".
commuters_new:
| from2 | to2 | amount2 | amount1 | factor |
|---|---|---|---|---|
| 1000 | 1001 | 10 | 0 | 0 |
| 1001 | 1002 | 22 | 2 | 11 |
| 1001 | 1004 | 2 | 0 | 0 |
| 2000 | 1002 | 50 | 200 | 0.25 |
| 2001 | 2001 | 8 | 80 | 0.1 |
| 2001 | 2003 | 100 | 0 | 0 |
What I have tried:
commuters_new[, amount1 := ifelse(from1==from2 & to1==to2, commuters_old$amount1, 0)]
OR
commuters_new$amount1[, paste(ifelse(from1==from2 & to1==to2, commuters_old$amount1, 0))]
But I always get the error:
Supplied 1703144 items to be assigned to 1408969 items of column 'amount1'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code