I have a dataset with 300 columns and 1000 rows, and a corresponding code book, both in data.table format. For simplicity, here is a small example with three coded columns:
library(data.table)

dt <- data.table(id = 1:10,
                 a = sample(c(1, 2, 3), 10, replace = TRUE),
                 b = sample(c(1, 2), 10, replace = TRUE),
                 c = sample(1:5, 10, replace = TRUE))
    id a b c
 1:  1 2 1 2
 2:  2 2 1 1
 3:  3 3 1 1
 4:  4 3 1 1
 5:  5 1 2 5
 6:  6 2 1 3
 7:  7 1 2 3
 8:  8 1 1 2
 9:  9 2 1 5
10: 10 3 2 4
cb <- data.table(var = c(rep("a", 3), rep("b", 2), rep("c", 5)),
                 val = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5),
                 des = c("red", "blue", "yellow", "yes", "no",
                         "K", "Na", "Ag", "Au", "Si"))
    var val    des
 1:   a   1    red
 2:   a   2   blue
 3:   a   3 yellow
 4:   b   1    yes
 5:   b   2     no
 6:   c   1      K
 7:   c   2     Na
 8:   c   3     Ag
 9:   c   4     Au
10:   c   5     Si
In cb, var names the corresponding column in dt, val is a coded value that appears in that column, and des is the description for that code. I want to edit dt by replacing each coded value with the matching des from cb. The result should look like this:
    id      a   b  c
 1:  1   blue yes Na
 2:  2   blue yes  K
 3:  3 yellow yes  K
 4:  4 yellow yes  K
 5:  5    red  no Si
 6:  6   blue yes Ag
 7:  7    red  no Ag
 8:  8    red yes Na
 9:  9   blue yes Si
10: 10 yellow  no Au
How do I perform an operation like this efficiently, in a way that doesn't make my computer sound like it has a built-in piston?
The reason is that I have pre-written code to analyze the data, and it needs the actual values in order to run. A general solution may also prove useful because I am often given data together with a code book, though usually there aren't this many variables.
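To make the intended operation concrete, here is a minimal sketch of one possible approach, not necessarily the fastest one. It assumes the code book always has the columns var, val and des as above, that only columns named in cb$var should be recoded, and that overwriting dt in place is acceptable. The small dt below uses fixed values (an assumption made purely so the result is reproducible) instead of sample().

```r
library(data.table)

# Small stand-ins for the real data and code book. The values in dt are
# fixed here for reproducibility; the dt in the question is sampled randomly.
dt <- data.table(id = 1:4,
                 a = c(2, 3, 1, 2),
                 b = c(1, 1, 2, 1),
                 c = c(2, 1, 5, 3))
cb <- data.table(var = c(rep("a", 3), rep("b", 2), rep("c", 5)),
                 val = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5),
                 des = c("red", "blue", "yellow", "yes", "no",
                         "K", "Na", "Ag", "Au", "Si"))

# Recode only the columns that appear in both dt and the code book,
# so id (which has no code book entry) is left untouched.
vars <- intersect(names(dt), unique(cb$var))

for (v in vars) {
  lookup <- cb[var == v]  # code book rows for this variable
  # match() returns, for each value in the column, its position in
  # lookup$val; values without a code book entry become NA.
  set(dt, j = v, value = lookup$des[match(dt[[v]], lookup$val)])
}

print(dt)
```

The loop runs once per coded column and each lookup is a vectorised match(), and set() replaces the column by reference, so dt is not copied on each pass. Any value missing from the code book ends up as NA, which may or may not be what the downstream analysis code expects.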