I have 8 datasets and I want to apply a function to convert any number less than 5 to NA on 3 columns(var1,var2,var3) of each dataset. How can I write a function to do it effectively and faster ? I went through lots of such questions on Stack overflow but I didnt find any answer where specific columns were used. I have written the function to replace but cant figure out how to apply to all the datasets.
Input:
Data1
variable1 variable2 variable3 variable4
10 36 56 99
15 3 2 56
4 24 1 1
Expected output:
variable1 variable2 variable3 variable4
10 36 56 99
15 NA NA 56
NA 24 NA 1
Perform the same thing for 7 more datasets.
Till now I have stored the needed variables and datasets in two different list.
var1=enquo(variable1)
var2=enquo(variable2)
var3=enquo(variable3)
Total=3
listofdfs=list()
listofdfs_1=list()
for(i in 1:8) {
df=sym((paste0("Data",i)))
listofdfs[[i]]=df
}
for(e in 1:Ttoal) {
listofdfs[[e]]= eval(sym(paste0("var",e)))
}
The selected columns will go through this function:
temp_1=function(x,h) {
h=enquo(h)
for(e in 1:Total) {
if(substr(eval(sym(paste0("var",e))),1,3)=="var") {
y= x %>% mutate_at(vars(!!h), ~ replace(., which(.<=5),NA))
return(y)
}
}
}
I was expecting something :
lapply(for each dataset's selected columns,temp_1)