I am mainly interested in replacing a specific value (81) in many columns across the dataframe.

For example, suppose this is my dataset:

    Id         Date         Col_01     Col_02   Col_03       Col_04
    30         2012-03-31   1          A42.2    20.46        43  
    36         1996-11-15   42         V73      23           55
    96         2010-02-07   X48        81       13           3R
    40         2010-03-18   AD14       18.12    20.12        36
    69         2012-02-21   8          22.45    12           10                 
    11         2013-07-03   81         V017     78.12        81         
    22         2001-06-01   11         09       55           12
    83         2005-03-16   80.45      V22.15   46.52        X29.11 
    92         2012-02-12   1          4        67           12 
    34         2014-03-10   82.12      N72.22   V45.44       10

I would like to replace the value 81 in columns Col_01, Col_02, Col_03, and Col_04 with NA. The expected final dataset looks like this:

    Id         Date         Col_01     Col_02   Col_03       Col_04
    30         2012-03-31   1          A42.2    20.46        43  
    36         1996-11-15   42         V73      23           55
    96         2010-02-07   X48        NA       13           3R
    40         2010-03-18   AD14       18.12    20.12        36
    69         2012-02-21   8          22.45    12           10                 
    11         2013-07-03   NA         V017     78.12        NA
    22         2001-06-01   11         09       55           12
    83         2005-03-16   80.45      V22.15   46.52        X29.11 
    92         2012-02-12   1          4        67           12 
    34         2014-03-10   82.12      N72.22   V45.44       10

I tried this approach

df %>% select(matches("^Col_\\d+$"))[ df %>% select(matches("^Col_\\d+$")) == 81 ] <- NA

This is modeled on the solution data[ , 2:3 ][ data[ , 2:3 ] == 4 ] <- 10 from the question "Replacing occurrences of a number in multiple columns of data frame with another value in R".

This did not work.

Any suggestions are much appreciated. Thanks in advance.

2 Answers

Instead of using select, we can pass matches directly to across inside mutate and use na_if to replace the values equal to "81" with NA:

library(dplyr)
df <- df %>%
   mutate(across(matches("^Col_\\d+$"), ~ na_if(., "81")))

-output

df
   Id       Date Col_01 Col_02 Col_03 Col_04
1  30 2012-03-31      1  A42.2  20.46     43
2  36 1996-11-15     42    V73     23     55
3  96 2010-02-07    X48   <NA>     13     3R
4  40 2010-03-18   AD14  18.12  20.12     36
5  69 2012-02-21      8  22.45     12     10
6  11 2013-07-03   <NA>   V017  78.12   <NA>
7  22 2001-06-01     11     09     55     12
8  83 2005-03-16  80.45 V22.15  46.52 X29.11
9  92 2012-02-12      1      4     67     12
10 34 2014-03-10  82.12 N72.22 V45.44     10

Or we can use base R

i1 <- grep("^Col_\\d+$", names(df))
df[i1][df[i1] == "81"] <- NA

The issue in the OP's code is that the assignment is not triggered as expected. Extracting the matching values works:

(df %>% 
     select(matches("^Col_\\d+$")))[(df %>% 
        select(matches("^Col_\\d+$"))) == "81" ]
[1] "81" "81" "81"

which is the same as

df[i1][df[i1] == "81"]
[1] "81" "81" "81"

but the assignment fails:

(df %>% 
      select(matches("^Col_\\d+$")))[(df %>% 
         select(matches("^Col_\\d+$"))) == "81" ] <- NA
Error in (df %>% select(matches("^Col_\\d+$")))[(df %>% select(matches("^Col_\\d+$"))) ==  : 
  could not find function "(<-"

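In base R, df[i1][df[i1] == "81"] <- NA performs the assignment through the replacement function [<- on the object df. With the piped expression wrapped in parentheses, R instead looks for a replacement function named (<-, which does not exist, hence the error.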
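
As a rough illustration of why the base R form works, the sketch below spells out the nested replacement step by step. It uses a two-row data frame that is a cut-down, illustrative version of the data, not the full dataset; the names small, shortcut, stepwise, and cols are made up for this example.

```r
# Cut-down illustrative data: two rows, two Col_ columns (assumption, not the full dataset)
small <- data.frame(Id = c(96L, 11L),
                    Col_01 = c("X48", "81"),
                    Col_02 = c("81", "V017"),
                    stringsAsFactors = FALSE)
i1 <- grep("^Col_\\d+$", names(small))

# Short form: nested replacement on a named object
shortcut <- small
shortcut[i1][shortcut[i1] == "81"] <- NA

# Roughly what R does behind the scenes: extract, modify, write back
stepwise <- small
cols <- stepwise[i1]        # extract the Col_ columns
cols[cols == "81"] <- NA    # `[<-` on the extracted copy
stepwise[i1] <- cols        # `[<-` on the original object

identical(shortcut, stepwise)
```

The final write-back step needs a named object on the left-hand side, which is what the piped expression lacks.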

data

df <- structure(list(Id = c(30L, 36L, 96L, 40L, 69L, 11L, 22L, 83L, 
92L, 34L), Date = c("2012-03-31", "1996-11-15", "2010-02-07", 
"2010-03-18", "2012-02-21", "2013-07-03", "2001-06-01", "2005-03-16", 
"2012-02-12", "2014-03-10"), Col_01 = c("1", "42", "X48", "AD14", 
"8", "81", "11", "80.45", "1", "82.12"), Col_02 = c("A42.2", 
"V73", "81", "18.12", "22.45", "V017", "09", "V22.15", "4", "N72.22"
), Col_03 = c("20.46", "23", "13", "20.12", "12", "78.12", "55", 
"46.52", "67", "V45.44"), Col_04 = c("43", "55", "3R", "36", 
"10", "81", "12", "X29.11", "12", "10")),
 class = "data.frame", row.names = c(NA, 
-10L))

1 Comment

Also is.na(df[i1]) <- df[i1] == "81".
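
A minimal sketch of this variant, again on a small illustrative data frame (the name small and its two rows are assumptions, not the full dataset). The replacement form of is.na sets to NA every cell where the logical matrix on the right-hand side is TRUE.

```r
# Illustrative two-row data frame (assumption, not the full dataset)
small <- data.frame(Id = c(96L, 11L),
                    Col_01 = c("X48", "81"),
                    Col_02 = c("81", "V017"),
                    stringsAsFactors = FALSE)
i1 <- grep("^Col_\\d+$", names(small))

# Mark every cell equal to "81" in the selected columns as missing
is.na(small[i1]) <- small[i1] == "81"

small
```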

We can also use replace:

library(dplyr)

df <- df %>%
   mutate(across(matches("^Col_\\d+$"), ~ replace(.x, ~.x==81, NA)))

1 Comment

For some reason, your solution gives me an error: Error: Problem with mutate() input ... x invalid subscript type 'language'
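
The reported error is consistent with the second argument of replace being a formula (~.x==81) rather than a logical vector: replace uses that argument to index .x, and a formula is a language object. One possible fix, sketched here on a small illustrative data frame (the name small and its contents are assumptions, not the full dataset), is to drop the inner tilde so that a plain logical comparison is passed:

```r
library(dplyr)

# Illustrative two-row data frame (assumption, not the full dataset)
small <- data.frame(Id = c(96L, 11L),
                    Col_01 = c("X48", "81"),
                    Col_02 = c("81", "V017"),
                    stringsAsFactors = FALSE)

# Pass a logical vector (.x == "81") to replace(), not a nested formula
small <- small %>%
  mutate(across(matches("^Col_\\d+$"), ~ replace(.x, .x == "81", NA)))

small
```

The comparison uses the string "81" because the Col_ columns in the posted data are character vectors.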
