I have a list of sequences, aqi_range, and a data frame, df:
aqi_range = list(0:50,51:100,101:250)
df
   PM10_mean PM10_min PM10_max PM2.5_mean PM2.5_min PM2.5_max
1       85.6        3      264       75.7         3       240
2       105.        6      243       76.4         3       191
3       95.8       19      287       48.4         8       134
4       85.5       50      166       64.8        32       103
5       55.9       24      117       46.7        19        77
6       37.5        6      116       31.3         3        87
7         26        5       69       15.5         3        49
8       82.3       34      169       49.6        25       120
9        170       68      272        133        67       201
10       254      189      323        226       173       269
I've written two fairly simple functions that I want to apply to this data frame to calculate the AQI (Air Quality Index) for each pollutant.
# a = column from the data frame (PM10_mean or PM2.5_mean)
# b = list of sequences defined above
min_max_diff <- function(a, b) {
  for (i in b) {
    if (a %in% i) {
      min_val <- min(i)
      max_val <- max(i)
      return(max_val - min_val)
    }
  }
}
# a = column from the data frame (PM10_mean or PM2.5_mean)
# b = list of sequences defined above
c_low <- function(a, b) {
  for (i in b) {
    if (a %in% i) {
      min_val <- min(i)
      return(min_val)
    }
  }
}
Basically, the first function, min_max_diff, takes a value from df$PM10_mean or df$PM2.5_mean, looks for it in the list aqi_range, and returns the difference between the maximum and minimum of the sequence that contains it. Similarly, the second function, c_low, returns the minimum of that sequence.
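The lookup described above could be sketched in vectorized form as follows. This is only a sketch under assumptions: band_bounds is a hypothetical helper name, each band is treated as a continuous range that runs up to the next band's lower bound (so a decimal such as 85.6 lands in 51:100), and values above the last band return NA.

```r
aqi_range <- list(0:50, 51:100, 101:250)

# Hypothetical helper: for each value in a, return the lower bound and the
# width (max - min) of the band in b that contains it, matching by range.
band_bounds <- function(a, b) {
  lows  <- vapply(b, function(s) as.numeric(min(s)), numeric(1))
  highs <- vapply(b, function(s) as.numeric(max(s)), numeric(1))
  # Index of the last band whose lower bound is <= the value (0 if none)
  idx <- findInterval(a, lows)
  # Values below the first band or above the last band match no band
  idx[idx == 0 | a > max(highs)] <- NA
  data.frame(low = lows[idx], width = highs[idx] - lows[idx])
}

band_bounds(c(85.6, 37.5, 254), aqi_range)
```

The low column plays the role of c_low and the width column the role of min_max_diff, computed for a whole column in one call.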
I want to apply the formula below to the PM10_mean column to create a new column, PM10_AQI:
df$PM10_AQI = min_max_diff(df$PM10_mean, aqi_range) / (df$PM10_max - df$PM10_min) * (df$PM10_mean - df$PM10_min) + c_low(df$PM10_mean, aqi_range)
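As a worked check of the formula for the first row of df, read as width / (max - min) * (mean - min) + low: the sketch below assumes 85.6 is meant to fall in the 51:100 band, so the band width and lower bound are filled in by hand rather than produced by the functions above.

```r
# Row 1 of df: PM10_mean = 85.6, PM10_min = 3, PM10_max = 264
pm10_mean <- 85.6
pm10_min  <- 3
pm10_max  <- 264

# Assumed band for 85.6 is 51:100
band_width <- 100 - 51  # the value min_max_diff is meant to return
band_low   <- 51        # the value c_low is meant to return

pm10_aqi <- band_width / (pm10_max - pm10_min) * (pm10_mean - pm10_min) + band_low
pm10_aqi  # about 66.5
```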
I hope this explains it properly.
dput(df)?
diff is a base R function. Please use another name.
PM10_mean is 85.6 and you are checking for if (a %in% i) in the function. None of the values in aqi_range satisfies this criterion, so a %in% i will never be true. Note that aqi_range has all integers whereas the numbers in PM10_mean are decimals, and you are performing an exact match. Do you want to check if the numbers are in range or something? Also, in the last part where you have shared data, I am assuming your input has only two columns, PM10_mean and PM2.5_mean; the rest of them are your expected output columns.
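To illustrate the point about exact matching, here is a minimal check for the single value 85.6. The names exact_hit and range_hit are made up for this example, and the range test is just one possible reading of what "in range" could mean.

```r
aqi_range <- list(0:50, 51:100, 101:250)
x <- 85.6

# Exact match: 85.6 is not one of the integers in any of the sequences
exact_hit <- vapply(aqi_range, function(s) x %in% s, logical(1))

# Range check: is x between the smallest and largest value of the sequence?
range_hit <- vapply(aqi_range, function(s) x >= min(s) && x <= max(s), logical(1))

exact_hit  # FALSE FALSE FALSE
range_hit  # FALSE  TRUE FALSE
```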