Goal: Clean a data frame with a column (call it v1) in which each cell holds one or, often, several comma-separated values. I would like to generate one binary indicator variable per distinct value (say v1_1, v1_2, v1_3), set to 1 when that value appears in the cell and 0 otherwise. (Reality: I have a very large, messy Excel dataset from elsewhere in which many cells hold multiple values, and I would like to sort them into binary columns efficiently, ideally with tidyverse tools, though base R works too.)
Reproducible example:
df <- data.frame(caseID = 1:5,
                 v1 = c(2, 1, "1,3", 1, "2, 3"))
df
desired_df <- data.frame(caseID = 1:5,
                         v1_1 = c(0, 1, 1, 1, 0),
                         v1_2 = c(1, 0, 0, 0, 1),
                         v1_3 = c(0, 0, 1, 0, 1))
desired_df
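# Base R: split each cell on commas, stack the pieces into long format keyed by caseID, then cross-tabulate.
out <- cbind(df[1], as.data.frame.matrix(table(stack(setNames(strsplit(as.character(df$v1), ",\\s*"), df$caseID))[2:1])))
# The tabulated columns come out named 1, 2, 3; prefix them to match desired_df.
names(out)[-1] <- paste0("v1_", names(out)[-1])
out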
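Since the question asks for tidyverse tools if possible, here is a minimal sketch of one way to get the same layout with dplyr and tidyr. It assumes both packages are installed and that values within a cell are separated by a comma with optional trailing whitespace. The names `result` and `present` are illustrative and do not come from the original post.

```r
library(dplyr)
library(tidyr)

df <- data.frame(caseID = 1:5,
                 v1 = c(2, 1, "1,3", 1, "2, 3"))

result <- df %>%
  # One row per (caseID, value) pair; sep is treated as a regular expression.
  separate_rows(v1, sep = ",\\s*") %>%
  # Drop repeats in case a value is listed twice in the same cell.
  distinct() %>%
  # Helper column (hypothetical name) marking that the value is present.
  mutate(present = 1L) %>%
  # Spread values into v1_<value> columns, filling absent combinations with 0.
  pivot_wider(names_from = v1, values_from = present,
              names_prefix = "v1_", values_fill = 0L,
              names_sort = TRUE)

result
```

The result is a tibble with integer indicator columns. The `distinct()` step keeps the indicators strictly 0/1 even if a cell repeats a value, which a plain cross-tabulation would count twice.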