I've looked all around for this, but have found no answers. I have a data frame that contains columns with multiple levels along the lines of "Unknown" "No response" or "Refused to answer" and the like. All of these are useless to me for analysis, so I want to replace them all with NA.
Note that I do not want to replace them across the entire data frame, only specific columns! There are other columns that contain values with the same names that are actually useful to me and I want to leave them alone.
I've managed to replace them one at a time by using:
data$col1 <- factor(gsub("Unknown", "NA", data$col1))
but that only works for one string at a time. If I try to add multiple strings, R throws an error. Is there a more efficient way to do this?
I'm relatively new to coding, please be gentle!
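Use the na.strings argument of read.csv. That is, while reading the dataset, you can specify which values should be converted to NA:
dat <- read.csv("yourfile.csv", na.strings = c("Unknown", "No response", "Refused to answer"))
Note that na.strings applies to every column in the file, so it will also blank out those values in columns where you want to keep them.
To change a single column after the data has been read, you can combine the strings into one regular expression with | (meaning "or"):
data$col1 <- factor(gsub("Unknown|No response|Refused to answer", "NA", data$col1))
Be aware that this inserts the text "NA" as a factor level rather than a true missing value, so is.na() will not detect it.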
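If you need real missing values in only some columns, one possible approach is to match the unwanted strings with %in% and assign NA directly. This is a minimal sketch, not the only way to do it: the example data frame, the column names col1, col2, col3, and the helper vectors na_values and cols are all made up for illustration.

```r
# Hypothetical example data; column names are assumptions for illustration
data <- data.frame(
  col1 = factor(c("Yes", "Unknown", "No", "Refused to answer")),
  col2 = factor(c("No response", "Yes", "Unknown", "No")),
  col3 = factor(c("Unknown", "A", "B", "Unknown"))  # "Unknown" is useful here
)

# Strings to treat as missing, and the columns to clean (both assumed names)
na_values <- c("Unknown", "No response", "Refused to answer")
cols <- c("col1", "col2")

# For each chosen column: convert to character, set matches to a real NA,
# then rebuild the factor so the unwanted levels are dropped
data[cols] <- lapply(data[cols], function(x) {
  x <- as.character(x)
  x[x %in% na_values] <- NA
  factor(x)
})

# Count the missing values per column to check the result
print(colSums(is.na(data)))
```

Because only the columns listed in cols are touched, col3 keeps its "Unknown" values. Unlike the gsub version, %in% compares whole values exactly, so a value such as "Unknown origin" would not be partially rewritten.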