Is there a clean way to solve the following problem with data.table, possibly together with other packages such as stringr?
Suppose I have the following data table:
library(data.table)
DT <- data.table(name = c("Carlos", "Henry", "John"),
                 ID = c("US115115, CH123232, AB155, US4445", "CH112, BB53", "US57677777"))
This looks like:
     name                                ID
1: Carlos US115115, CH123232, AB155, US4445
2:  Henry                       CH112, BB53
3:   John                        US57677777
I want to create another column, say ID2, that keeps only the "US identities" from the ID column, so that the final data table looks like:
     name                                ID              ID2
1: Carlos US115115, CH123232, AB155, US4445 US115115, US4445
2:  Henry                       CH112, BB53               NA
3:   John                        US57677777       US57677777
Each element of ID2 should be a single string. I've been able to code a version that takes the first "US identity" and discards the rest, but I haven't found a solution that handles multiple matches per row.
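To make the first-match versus all-matches distinction concrete, here is a minimal sketch. It assumes a "US identity" is the letters US followed by one or more digits (the pattern `\\bUS\\d+` is my guess at the format), and the column name `first_US` is purely illustrative. It relies on stringr's `str_extract()` and `str_extract_all()`; it is one possible direction, not necessarily the cleanest.

```r
library(data.table)
library(stringr)

DT <- data.table(name = c("Carlos", "Henry", "John"),
                 ID = c("US115115, CH123232, AB155, US4445", "CH112, BB53", "US57677777"))

# Assumed pattern: "US" at a word boundary, followed by digits
us_pattern <- "\\bUS\\d+"

# str_extract() keeps only the first match per row (NA when there is none)
DT[, first_US := str_extract(ID, us_pattern)]

# str_extract_all() returns a list holding every match per row;
# collapse each list element into one string, or NA when nothing matched
DT[, ID2 := vapply(
  str_extract_all(ID, us_pattern),
  function(m) if (length(m) > 0) paste(m, collapse = ", ") else NA_character_,
  character(1)
)]

print(DT)
```

Here `vapply(..., character(1))` guarantees that ID2 comes out as a plain character column rather than a list column.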
Any help would be greatly appreciated!