I have a data frame (df1, a sample of which is shown below). For each row, I want to average the values in columns a1, b1, and c1, using only those whose counterpart in a2, b2, and c2 is positive. For example, in the first row all of a2, b2, and c2 are positive, so I take the corresponding values in a1, b1, and c1 and average them; the result is 0.4933. In the second row only c2 is positive, so I take just the value in c1 (0.3).
a1 b1 c1 a2 b2 c2 desired outcome
0.51 0.49 0.48 0.05 0.03 0.09 0.493333
0.33 0.31 0.3 -0.03 -0.05 0.01 0.3
0.22 0.2 0.19 0.04 0.02 0.08 0.203333
0.54 0.52 0.51 -0.05 0.08 -0.01 0.52
0.45 0.43 0.42 -0.03 -0.05 0.01 0.42
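To make the example reproducible, the six input columns of the sample can be rebuilt as a data frame. This is only a sketch of the sample rows shown above; the name df1 is taken from the code in the question, and the real data presumably has more rows and columns.

```r
# Sample rows from the table above (input columns only).
df1 <- data.frame(
  a1 = c(0.51, 0.33, 0.22, 0.54, 0.45),
  b1 = c(0.49, 0.31, 0.20, 0.52, 0.43),
  c1 = c(0.48, 0.30, 0.19, 0.51, 0.42),
  a2 = c(0.05, -0.03, 0.04, -0.05, -0.03),
  b2 = c(0.03, -0.05, 0.02, 0.08, -0.05),
  c2 = c(0.09, 0.01, 0.08, -0.01, 0.01)
)
```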
Below is my code, where I list every scenario by hand. I am looking for more efficient code that can handle more columns.
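library(dplyr)

# Each sign combination is spelled out by hand, averaging the matching *1 columns.
# Row-wise averages use arithmetic because mean(a1, b1, c1) only averages its first argument.
df2 <- df1 %>%
  mutate(outcome = ifelse(a2 > 0 & b2 > 0 & c2 > 0, (a1 + b1 + c1) / 3,
                   ifelse(a2 > 0 & b2 > 0 & c2 < 0, (a1 + b1) / 2,
                   ifelse(a2 > 0 & b2 < 0 & c2 < 0, a1,
                   ifelse(a2 < 0 & b2 > 0 & c2 > 0, (b1 + c1) / 2,
                   ifelse(a2 < 0 & b2 < 0 & c2 > 0, c1,
                          b1))))))  # every other combination falls through to b1, which is only right when b2 alone is positive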
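pos <- df1[ , grepl("2$", names(df1))] > 0  # TRUE where the paired *2 value is positive
df1$`desired outcome` <- rowSums(df1[ , grepl("1$", names(df1))] * pos) / rowSums(pos)  # divide by the number of values kept, not by the number of columns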
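A sketch that scales to any number of column pairs: set every value whose paired sign column is not positive to NA, then call rowMeans with na.rm = TRUE so each row is averaged over the values actually kept. The helper name masked_row_mean and its two pattern arguments are hypothetical, not from the original post, and the code assumes the *1 and *2 columns appear in the same order so that they pair up by position.

```r
# Sample rows from the question (input columns only).
df1 <- data.frame(
  a1 = c(0.51, 0.33, 0.22, 0.54, 0.45),
  b1 = c(0.49, 0.31, 0.20, 0.52, 0.43),
  c1 = c(0.48, 0.30, 0.19, 0.51, 0.42),
  a2 = c(0.05, -0.03, 0.04, -0.05, -0.03),
  b2 = c(0.03, -0.05, 0.02, 0.08, -0.05),
  c2 = c(0.09, 0.01, 0.08, -0.01, 0.01)
)

# Hypothetical helper: row-wise mean of the value columns, keeping only
# entries whose positionally paired sign column is strictly positive.
masked_row_mean <- function(df, value_pattern = "1$", sign_pattern = "2$") {
  values <- as.matrix(df[, grepl(value_pattern, names(df)), drop = FALSE])
  signs  <- as.matrix(df[, grepl(sign_pattern, names(df)), drop = FALSE])
  stopifnot(identical(dim(values), dim(signs)))  # columns must pair up one to one

  # Drop values whose sign entry is missing, zero, or negative.
  values[is.na(signs) | signs <= 0] <- NA

  # Rows with no positive sign entry come out as NaN.
  rowMeans(values, na.rm = TRUE)
}

df1$outcome <- masked_row_mean(df1)
```

Because the columns are selected by name pattern, adding further pairs such as d1 and d2 needs no change to the helper, as long as no other column name ends in 1 or 2.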