I have a data frame with multiple columns; I have reduced its size to illustrate my question.
Column 'A' has a complete set of 6 values. The remaining 5 columns, 'v1' to 'v5', each have 2 missing values (NA) at random positions.
df <- data.frame('A' = c(2, 4, 7, 5, 3, 4), 'v1' = c(3, NA, NA, 4, 5, 5),
'v2' = c(NA, NA, 6, 4, 5, 5), 'v3' = c(3, 4, NA, NA, 5, 5),
'v4' = c(3, 4, 6, 4, NA, NA), 'v5' = c(3, 4, 6, NA, NA, 5))
The desired result, with every NA filled in, is:
  A   v1   v2   v3   v4   v5
1 2 3.00 1.75 3.00 3.00 3.00
2 4 3.55 3.55 4.00 4.00 4.00
3 7 6.25 6.00 6.25 6.00 6.00
4 5 4.00 4.00 4.45 4.00 4.45
5 3 5.00 5.00 5.00 2.65 2.65
6 4 5.00 5.00 5.00 3.55 5.00
What I would like to do is fill in all NAs in the data frame using the equation -0.05 + 0.9*x, where x is the value in column A in the same row. For example:
In v1, the first NA is in row 2, where column A = 4, so this NA should be filled with:
-0.05 + 0.9*4 = 3.55
Likewise, for the NA in v1 row 3, where column A = 7, the fill value should be -0.05 + 0.9*7 = 6.25.
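Because arithmetic in R is vectorised, the same calculation can be checked for every row of column A at once. This is only a minimal sketch; the name fill is introduced here for illustration and is not part of the original data.

```r
# Column A from the example data
A <- c(2, 4, 7, 5, 3, 4)

# Candidate replacement value for each row: -0.05 + 0.9 * A
fill <- -0.05 + 0.9 * A

print(fill)
```

The second and third elements of fill correspond to the two hand calculations above (rows 2 and 3).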
I was trying to use the ifelse() function, but I do not know how to apply it to the whole data frame and link it to an equation that uses a value from another column in the same row.
My attempt is below; I know it is wrong, but it gives an idea of my approach:
ifelse(df$v1:v5 == NA, -0.05 + 0.9*df$A, df$v1:v5)
Please use dput to give us real example data to work with rather than a screenshot of a data frame. Also, you should apply ifelse to each column of the data frame, perhaps using apply, and not to the whole data frame.
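One way the column-by-column suggestion might look is sketched below. It rests on a few assumptions: lapply is used in place of apply, because apply converts a data frame to a matrix first; the helper name cols is hypothetical; and the columns to fill are assumed to be exactly v1 to v5. Note also that missing values have to be tested with is.na(), since a comparison such as x == NA returns NA rather than TRUE or FALSE.

```r
# Example data from the question
df <- data.frame(A  = c(2, 4, 7, 5, 3, 4),
                 v1 = c(3, NA, NA, 4, 5, 5),
                 v2 = c(NA, NA, 6, 4, 5, 5),
                 v3 = c(3, 4, NA, NA, 5, 5),
                 v4 = c(3, 4, 6, 4, NA, NA),
                 v5 = c(3, 4, 6, NA, NA, 5))

# Columns to fill (hypothetical helper; assumes the v1..v5 naming)
cols <- paste0("v", 1:5)

# For each column, keep existing values and replace NAs with
# -0.05 + 0.9 * A taken from the same row
df[cols] <- lapply(df[cols], function(col) {
  ifelse(is.na(col), -0.05 + 0.9 * df$A, col)
})

print(df)
```

Column A is left untouched because it is not listed in cols, and ifelse works element by element, so each NA is paired with the A value from its own row.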