In a data frame I have a character column (one word per entry) where each word can appear multiple times:
word = c(
"OMEPRAZOL",
"PARACETAMOL",
"HIDROFEROL",
"ENALAPRIL",
"PARACETAMOL",
"NOISE"
)
In a different dataframe I have a column with strings and another with an associated ID code:
string_code = data.frame(
string = c(
"OMEPRAZOL XXXX",
"OMEPRAZOL YYYY",
"PARACETAMOL/A XXXX",
"PARACETAMOL/B YYYY",
"HIDROFEROL XXXX",
"ENALAPRIL XXXX",
"ENALAPRIL YYYY"),
code = c(
"11",
"11",
"22",
"22",
"33",
"44",
"44")
)
I would like to look up each element of word in string_code$string and, when there is a match, get back the associated ID from string_code$code (only the first match, since there may be several and the ID is the same anyway), or NA if there is no match. The desired result is:
word_code = data.frame(
word = c(
"OMEPRAZOL",
"PARACETAMOL",
"HIDROFEROL",
"ENALAPRIL",
"PARACETAMOL",
"NOISE"),
code = c(
"11",
"22",
"33",
"44",
"22",
NA)
)
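One possible way to get there in base R, as a minimal sketch rather than a definitive solution: it assumes a "match" means the word occurs as a literal substring of the string (so `grepl(..., fixed = TRUE)`), and the helper name `first_match` is only illustrative. The sample data is repeated so the snippet runs on its own.

```r
# Sample data copied from the question
word <- c("OMEPRAZOL", "PARACETAMOL", "HIDROFEROL",
          "ENALAPRIL", "PARACETAMOL", "NOISE")

string_code <- data.frame(
  string = c("OMEPRAZOL XXXX", "OMEPRAZOL YYYY",
             "PARACETAMOL/A XXXX", "PARACETAMOL/B YYYY",
             "HIDROFEROL XXXX", "ENALAPRIL XXXX", "ENALAPRIL YYYY"),
  code = c("11", "11", "22", "22", "33", "44", "44"),
  stringsAsFactors = FALSE
)

# For each word, take the row index of the first string that contains it.
# Assumption: literal substring matching (fixed = TRUE), not a regex.
# which(...)[1] yields NA_integer_ when nothing matches.
first_match <- vapply(
  word,
  function(w) which(grepl(w, string_code$string, fixed = TRUE))[1],
  integer(1),
  USE.NAMES = FALSE
)

# Indexing with NA returns NA, so unmatched words get code NA
word_code <- data.frame(
  word = word,
  code = string_code$code[first_match],
  stringsAsFactors = FALSE
)

print(word_code)
```

Because this is plain substring matching, a short word that happens to be contained in a longer, unrelated string would also match. If that is a concern, matching against the first token of each string (for example after splitting on spaces or "/") may be safer.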