2
    # Create a data frame
        > df <- data.frame(a = rnorm(7), b = rnorm(7), c = rnorm(7), threshold = rnorm(7))
        > df <- round(abs(df), 2)
        > 
        > df
             a    b    c threshold
        1 1.17 0.27 1.26      0.19
        2 1.41 1.57 1.23      0.97
        3 0.16 0.11 0.35      1.34
        4 0.03 0.04 0.10      1.50
        5 0.23 1.10 2.68      0.45
        6 0.99 1.36 0.17      0.30
        7 0.28 0.68 1.22      0.56
        > 
        >
    # Replace values in columns a, b, and c with NA if > value in threshold
        > df[1:3][df[1:3] > df[4]] <- "NA"
        Error in Ops.data.frame(df[1:3], df[4]) : 
          ‘>’ only defined for equally-sized data frames

There is probably an obvious solution that I am missing. The intent is to replace values in columns "a", "b", and "c" with NA if the value is larger than the value in "threshold" in the same row, so the comparison has to be done row by row.

If I had done it right, the df would look like this:

         a    b    c threshold
    1   NA   NA   NA      0.19
    2   NA   NA   NA      0.97
    3 0.16 0.11 0.35      1.34
    4 0.03 0.04 0.10      1.50
    5 0.23   NA   NA      0.45
    6   NA   NA 0.17      0.30
    7 0.28   NA   NA      0.56

I also tried the apply() approach, but to no avail. Can you help, please?

1
  • When you say you also "tried the apply approach" what did you try? Commented Mar 1, 2019 at 5:41

4 Answers

3

dplyr is a good fit for this kind of column-wise operation. One way is shown below:

> library(dplyr)
> set.seed(10)
> df <- data.frame(a = rnorm(7), b = rnorm(7), c = rnorm(7), threshold = rnorm(7))
> df <- round(abs(df), 2)
> df
     a    b    c threshold
1 0.02 0.36 0.74      2.19
2 0.18 1.63 0.09      0.67
3 1.37 0.26 0.95      2.12
4 0.60 1.10 0.20      1.27
5 0.29 0.76 0.93      0.37
6 0.39 0.24 0.48      0.69
7 1.21 0.99 0.60      0.87
> 
> df %>%
+   mutate_at(vars(a:c), ~ifelse(.x > df$threshold, NA, .x))
     a    b    c threshold
1 0.02 0.36 0.74      2.19
2 0.18   NA 0.09      0.67
3 1.37 0.26 0.95      2.12
4 0.60 1.10 0.20      1.27
5 0.29   NA   NA      0.37
6 0.39 0.24 0.48      0.69
7   NA   NA 0.60      0.87

2

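You can use the apply function over the columns of the data frame:

# MARGIN = 2 applies the function to each column; x is one column as a vector
df[, 1:3] <- apply(df[, 1:3, drop = FALSE], 2, function(x) ifelse(x > df[, 4], NA, x))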
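As a way to check this kind of column-wise replacement against the table in the question, here is a minimal sketch that rebuilds the question's data frame by hand. It uses lapply() and replace() instead of apply() and ifelse(); that is a swapped-in variant, not the code above, chosen here only because it processes each column as a plain vector without converting the data frame to a matrix first. The variable name cols is introduced for illustration.

```r
# Data copied from the data frame printed in the question
df <- data.frame(
  a = c(1.17, 1.41, 0.16, 0.03, 0.23, 0.99, 0.28),
  b = c(0.27, 1.57, 0.11, 0.04, 1.10, 1.36, 0.68),
  c = c(1.26, 1.23, 0.35, 0.10, 2.68, 0.17, 1.22),
  threshold = c(0.19, 0.97, 1.34, 1.50, 0.45, 0.30, 0.56)
)

# Columns to mask, selected by name rather than by position
cols <- c("a", "b", "c")

# lapply() walks the columns one at a time; replace() sets the entries
# that exceed the threshold in the same row to NA and keeps the rest
df[cols] <- lapply(df[cols], function(x) replace(x, x > df$threshold, NA))

df
```

The NA pattern this produces should line up with the expected table in the question, since the comparison is made element by element against the threshold column.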


1

The problem with your code is the use of df[4] instead of df[, 4]. The difference is that df[4] returns a data.frame with one column, whereas df[, 4] returns a vector.

That's why

df[1:3] > df[4]

fails with

Error in Ops.data.frame(df[1:3], df[4]) : ‘>’ only defined for equally-sized data frames

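While this works as expected (note that the replacement value is NA, not the string "NA", which would coerce the modified columns to character):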

df[1:3][df[1:3] > df[, 4]] <- NA
df
#     a    b    c threshold
#1 0.63 0.74   NA      0.78
#2   NA   NA 0.04      0.07
#3 0.84 0.31 0.02      1.99
#4   NA   NA   NA      0.62
#5   NA   NA   NA      0.06
#6   NA   NA   NA      0.16
#7 0.49   NA 0.92      1.47

data

set.seed(1)
df <- data.frame(a = rnorm(7), b = rnorm(7), c = rnorm(7), threshold = rnorm(7))
df <- round(abs(df), 2)
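The indexing difference described in this answer can be inspected directly. The sketch below uses a tiny hand-made data frame (the values are made up for illustration and are not from the question) to show what each form of indexing returns, and that comparing a data frame with a vector of length nrow(df) yields a logical matrix that can be used as a mask.

```r
# Small illustrative data frame; the threshold sits in column 3 here
df <- data.frame(a = c(1, 5), b = c(4, 2), threshold = c(3, 3))

class(df[3])    # single brackets keep a one-column data.frame
class(df[, 3])  # the matrix-style form drops to a plain numeric vector
class(df[[3]])  # double brackets also extract the vector

# Each column of df[1:2] is compared element by element with the vector,
# so every value is tested against the threshold in its own row
mask <- df[1:2] > df[, 3]
mask

# The logical matrix selects the cells to overwrite with NA
df[1:2][mask] <- NA
df
```

Using df[[3]] or df$threshold in place of df[, 3] should behave the same way here, since all three give the column as a vector.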


0

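You can use a for-loop like this (it assumes threshold is the last column, column 4):

# Loop over every column except the last one, which holds the threshold
for (i in seq_len(ncol(df) - 1)) {
  # ifelse() is vectorised, so each row is compared with its own threshold
  df[, i] <- ifelse(df[, i] > df[, 4], NA, df[, i])
}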
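The loop above relies on column positions. As a rough sketch of the same idea with columns selected by name, the loop could be wrapped in a small function. The name mask_above and its arguments are hypothetical, introduced here only for illustration, and the example values are made up.

```r
# mask_above() is a hypothetical helper, not part of base R or any package
mask_above <- function(data, cols, threshold_col) {
  limit <- data[[threshold_col]]  # threshold column as a plain vector
  for (col in cols) {
    # Values above the same-row threshold become NA; others are kept
    data[[col]] <- ifelse(data[[col]] > limit, NA, data[[col]])
  }
  data
}

# Illustrative data, not taken from the question
df <- data.frame(a = c(0.2, 0.9), b = c(0.7, 0.1), threshold = c(0.5, 0.5))
result <- mask_above(df, cols = c("a", "b"), threshold_col = "threshold")
result
```

Because the function returns a modified copy, the original df is left unchanged unless the result is assigned back to it.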
