I have one master dataframe df:
df <- data.frame(c("A", "B", "C"), c(1,2,3), c(3,1,2), c(4,2,1), rep(NA, 3), rep(NA, 3))
colnames(df) <- c("text", "var1", "var2", "var3", "value1", "value2")
And another dataframe df.upd with new information:
df.upd <- data.frame(c(1,2), c(3,1), c(4,2),c(0.5, 0.6), c(12, 20))
colnames(df.upd) <- c("var1", "var2", "var3", "value1", "value2")
> df
  text var1 var2 var3 value1 value2
1    A    1    3    4     NA     NA
2    B    2    1    2     NA     NA
3    C    3    2    1     NA     NA
> df.upd
  var1 var2 var3 value1 value2
1    1    3    4    0.5     12
2    2    1    2    0.6     20
I want to match on the columns "var1", "var2", "var3" and update the columns "value1" and "value2". So rows 1 and 2 of df.upd would update rows 1 and 2 of df. In other words, row x of df.upd updates row y of df whenever all(as.numeric(df.upd[x, 1:3]) == as.numeric(df[y, 2:4])) is TRUE.
The master df has around 30k rows and 60 columns, so a for loop is not an option. Any idea how to accomplish this faster?
merge with all.x = TRUE (left join), then use ifelse with is.na to update the relevant columns, then drop extra columns

library(data.table); cols <- paste0('value', 1:2); setDT(df)[setDT(df.upd), (cols) := mget(paste0("i.", cols)), on=.(var1, var2, var3)]
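The merge suggestion above could be sketched in base R roughly as follows. This is only one possible reading of that comment, not a tested solution for the real 30k-row data. The names keys, vals, x, m, res, the helper column .row and the ".new" suffix are made up for illustration, and the sketch assumes df.upd has at most one row per combination of var1, var2, var3 (otherwise merge would duplicate rows of df).

```r
# Example data from the question
df <- data.frame(text = c("A", "B", "C"), var1 = c(1, 2, 3), var2 = c(3, 1, 2),
                 var3 = c(4, 2, 1), value1 = NA, value2 = NA)
df.upd <- data.frame(var1 = c(1, 2), var2 = c(3, 1), var3 = c(4, 2),
                     value1 = c(0.5, 0.6), value2 = c(12, 20))

keys <- c("var1", "var2", "var3")   # columns to match on
vals <- c("value1", "value2")       # columns to update

# merge() sorts by the keys, so remember the original row order
x <- df
x$.row <- seq_len(nrow(x))

# Left join: every row of df is kept; columns coming from df.upd get ".new"
m <- merge(x, df.upd, by = keys, all.x = TRUE, suffixes = c("", ".new"))
m <- m[order(m$.row), ]

# Take the new value where a match was found, otherwise keep the old one.
# This loops over the two value columns only, not over rows.
for (v in vals) {
  new <- m[[paste0(v, ".new")]]
  m[[v]] <- ifelse(is.na(new), m[[v]], new)
}

# Drop the helper columns and restore the original column order
res <- m[, names(df)]
rownames(res) <- NULL
res
```

Unlike the data.table one-liner, which updates df by reference, this leaves df untouched and returns the updated copy in res.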