I'm writing a function to clean some CEX data (the details don't really matter), and I can't figure out why %in% fails to subset a data frame against a vector of values when the analogous operation with == on a single value works. What I'm trying to do is shown in f_fails() below. Unless I'm mistaken, I need to be able to pass the column name as a string, and that is exactly the form that fails.
Is there something about %in% in items 6 and 8 below that does not apply to ==? How else can I do what items 6 and 8 attempt?
# Test Data
set.seed(123)
df <- data.frame(
  NEWID = rep(1:10, 1, each = 10),
  COST = rnorm(100, 1000, 10),
  UCC = round(runif(100, 3995, 4005))
)
# Items 1-5 work; item 6 does not
# 1.
df[df$UCC == 4000,]
# 2.
df[df$"UCC" == 4000,]
# 3.
df[df["UCC"] == 4000,]
# 4.
df[df$UCC %in% c(4000,4001),]
# 5.
df[df$"UCC" %in% c(4000,4001),]
# 6. The one I need does not work
df[df["UCC"] %in% c(4000,4001),]
# 7. This works fine
f_works <- function(data, filter_var, one_val){
  # I can feed values with == and filter
  d <- data[data[filter_var] == one_val,]
  d
}
# 8. This (what I want) returns an empty data frame.
f_fails <- function(data = df, filter_var, many_vals){
  # I cannot feed 2+ values with %in% and filter
  d <- data[data[filter_var] %in% many_vals,]
  d
}
f_works(df, "UCC", 4000)
f_fails(df, "UCC", c(4000,4001))
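One check that may help narrow this down: the sketch below compares what single and double brackets return for the same column, then tries a hypothetical variant, f_many() (a name made up here, not part of the code above), that extracts the column with [[ ]] before applying %in%. It assumes the same test data as above and that the only relevant difference between the two forms is the type of object handed to %in%.

```r
# Same test data as above (same seed and same order of random draws)
set.seed(123)
df <- data.frame(
  NEWID = rep(1:10, each = 10),
  COST = rnorm(100, 1000, 10),
  UCC = round(runif(100, 3995, 4005))
)

# Single brackets keep the data frame wrapper; double brackets extract the column
class(df["UCC"])    # "data.frame"
class(df[["UCC"]])  # "numeric"

# %in% returns one logical per element of its left-hand side.
# A one-column data frame has a single element (the column itself).
length(df["UCC"] %in% c(4000, 4001))    # 1
length(df[["UCC"]] %in% c(4000, 4001))  # 100

# Hypothetical variant of f_fails(): extract the column as a vector with [[ ]]
f_many <- function(data, filter_var, many_vals) {
  data[data[[filter_var]] %in% many_vals, ]
}

res <- f_many(df, "UCC", c(4000, 4001))

# Compare against item 4, which is known to work
identical(res, df[df$UCC %in% c(4000, 4001), ])
```

If the last line returns TRUE, the string-based version filters the same rows as item 4, which would suggest the problem lies in how the column is extracted rather than in %in% itself.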