I'm trying to replace values with NA across multiple columns if a condition is met.

Here's a sample dataset:

library(tidyverse)
sample <- tibble(id = 1:6,
                 team_score = 5:10,
                 cent_dept_test_agg = c(1, 2, 3, 4, 5, 6),
                 cent_dept_blue_agg = c(15:20),
                 num_in_dept = c(1, 1, 2, 5, 100, 6))

I want the columns that contain cent_dept_.*_agg to be NA when num_in_dept is 1, so it looks like this:

library(tidyverse)
solution <- tibble(id = 1:6,
                   team_score = 5:10,
                   cent_dept_test_agg = c(NA, NA, 3, 4, 5, 6),
                   cent_dept_blue_agg = c(NA, NA, 17:20),
                   num_in_dept = c(1, 1, 2, 5, 100, 6))

I've tried using replace_with_na_at (from the naniar package) and na_if (from the dplyr package), but I can't figure it out. I know my selection criterion is correct (dplyr::matches("cent_dept_.*_agg")), but I can't work out how to apply the replacement.

In my actual dataset, I have many columns that start with cent_dept and end with agg, so it's very important that the selection uses that matches component.

Thank you for your help!

1 Answer

We can use mutate_at to select the columns that match '^cent_dept_.*_agg$' and replace their values with NA where 'num_in_dept' is 1

library(dplyr)
sample %>%
    mutate_at(vars(matches('^cent_dept_.*_agg$')), ~ 
                  replace(., num_in_dept == 1, NA))
# A tibble: 6 x 5
#     id team_score cent_dept_test_agg cent_dept_blue_agg num_in_dept
#  <int>      <int>              <dbl>              <int>       <dbl>
#1     1          5                 NA                 NA           1
#2     2          6                 NA                 NA           1
#3     3          7                  3                 17           2
#4     4          8                  4                 18           5
#5     5          9                  5                 19         100
#6     6         10                  6                 20           6
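
For newer dplyr versions, the same replacement can probably be written with across() instead of mutate_at(). This is a minimal sketch, assuming dplyr 1.0.0 or later is installed; the name `result` is only illustrative, and the data is the sample tibble from the question.

```r
library(dplyr)

# Same sample data as in the question
sample <- tibble(id = 1:6,
                 team_score = 5:10,
                 cent_dept_test_agg = c(1, 2, 3, 4, 5, 6),
                 cent_dept_blue_agg = 15:20,
                 num_in_dept = c(1, 1, 2, 5, 100, 6))

# across() applies the lambda to every column selected by matches();
# num_in_dept is looked up in the data frame being mutated
result <- sample %>%
    mutate(across(matches('^cent_dept_.*_agg$'),
                  ~ replace(.x, num_in_dept == 1, NA)))
```

Columns outside the matches() selection, including num_in_dept itself, are left unchanged.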

In base R, we can also do

nm1 <- grep('^cent_dept_.*_agg$', names(sample))
sample[nm1] <- lapply(sample[nm1], function(x) 
         replace(x, sample$num_in_dept == 1, NA))

Or, for numeric columns, it can be done by multiplying by NA raised to the condition (NA^TRUE is NA, while NA^FALSE is 1)

sample[nm1] <-  sample[nm1] * NA^(sample$num_in_dept == 1)
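
To see why the arithmetic version works, here is a small sketch on plain vectors. The names `cond`, `mult`, and `x` are made up for illustration. It relies on R's rule that anything raised to the power 0 is 1, even NA.

```r
# TRUE is coerced to 1 and FALSE to 0 when used as an exponent
cond <- c(TRUE, TRUE, FALSE)

# NA^1 is NA, but NA^0 is 1
mult <- NA^cond

# Multiplying by NA blanks the value; multiplying by 1 keeps it
x <- c(15, 16, 17) * mult
x
# [1] NA NA 17
```

Because it relies on multiplication, this approach only suits numeric columns; the replace() versions should also work for other column types.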