9

I had a data frame where I recoded several columns so that 999 was set to NA:

dfB <- dfA %>%
  mutate(adhere = if_else(adhere==999, as.numeric(NA), adhere)) %>%
  mutate(engage = if_else(engage==999, as.numeric(NA), engage)) %>%
  mutate(quality = if_else(quality==999, as.numeric(NA), quality)) %>%
  mutate(undrstnd = if_else(undrstnd==999, as.numeric(NA), undrstnd)) %>%
  mutate(sesspart = if_else(sesspart==999, as.numeric(NA), sesspart)) %>%
  mutate(attended = if_else(attended>=9, as.integer(NA), attended))

I want to use mutate_at() with a range of columns and recode() instead of if_else(), but I am stuck on how to give it the condition. Based on some mutate_all() examples, I think it should be something like 999 = NA, but I also need the NA to match the type of .x, and I am unsure how to make it type-sensitive.

I tried:

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))
z <- y %>%
    mutate_at( vars(y1:y2), funs(recode(.,`999` = as.numeric(NA))))

But I get the warning "Unreplaced values treated as NA as .x is not compatible. Please specify replacements exhaustively or supply .default", and I can see that it worked for the numeric column y1 but not for the integer column y2:

> z
  y1 y2    y3
1  1 NA  TRUE
2  2 NA  TRUE
3 NA NA FALSE
4  3 NA FALSE
5  4 NA  TRUE

5 Answers

10

I think it is related to the column type. I added mutate_if() to convert all integer columns to numeric, and then set the replacement value to NA_real_. It seems to work.

library(dplyr)

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

z <- y %>%
  mutate_if(is.integer, as.numeric) %>%
  mutate_at(vars(y1:y2), funs(recode(.,`999` = NA_real_)))
z
#   y1 y2    y3
# 1  1  1  TRUE
# 2  2  2  TRUE
# 3 NA NA FALSE
# 4  3  3 FALSE
# 5  4  4  TRUE

1 Comment

Thanks, www. This does solve the problem of the warning. It forces everything to real and avoids the wrong kind of NA for the previously integer columns. I had considered this. Other parts of my code count on those columns being integer, so I would need to reset them to integer after recoding. I was hoping for a way to make the NA value responsive to the kind of number in each column.
8

Now that funs() has been deprecated in dplyr, here's the new way to go:

z <- y %>%
  mutate_if(is.integer, as.numeric) %>%
  mutate_at(vars(y1:y2), list(~recode(.,`999` = NA_real_)))

Replace funs with list and insert a ~ before recode.

1 Comment

list() isn't necessary if only one function is called.
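
To illustrate the comment above, here is a minimal sketch that passes a single formula straight to mutate_at() without list(). It assumes a dplyr version that accepts purrr-style formulas there (0.8 or later, as far as I know) and reuses the example data from the question.

```r
library(dplyr)

# Example data from the question
y <- data.frame(y1 = c(1, 2, 999, 3, 4),
                y2 = c(1L, 2L, 999L, 3L, 4L),
                y3 = c(TRUE, TRUE, FALSE, FALSE, TRUE))

# A single lambda can be supplied directly, with no list() wrapper
z <- y %>%
  mutate_if(is.integer, as.numeric) %>%
  mutate_at(vars(y1:y2), ~ recode(., `999` = NA_real_))

z
```

As in the answer above, this still converts y2 to numeric before recoding.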
7

Currently, based on dplyr documentation:

across() supersedes the family of "scoped variants" like summarise_at(), summarise_if(), and summarise_all().

So, using mutate and across instead is now recommended.

Like Chris LeBoa said, if you only want to convert an annoying value to NA, the function na_if() is probably the best choice:

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

y
   y1  y2    y3
1   1   1  TRUE
2   2   2  TRUE
3 999 999 FALSE
4   3   3 FALSE
5   4   4  TRUE
 
z <- y %>%
    mutate(across(
        y1:y2,
        ~na_if(., 999)
    ))

z
  y1 y2    y3
1  1  1  TRUE
2  2  2  TRUE
3 NA NA FALSE
4  3  3 FALSE
5  4  4  TRUE

Similarly, if you really want to recode values in multiple columns, you can follow the example from bcarothers:

df1 <- tibble(Q7_1=1:5,
              Q7_1_TEXT=c("let's","see","grogu","this","week"),
              Q8_1=6:10,
              Q8_1_TEXT=rep("grogu",5),
              Q8_2=11:15,
              Q8_2_TEXT=c("grogu","is","the","absolute","best"))

df2 <- df1 %>%
    mutate(across(
        starts_with("Q8") & ends_with("TEXT"),
        ~recode(., "grogu"="mando")
    ))


7

I'm having trouble understanding exactly what you want to accomplish, so let me know if this isn't quite it.


library(dplyr)

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

y

#>    y1  y2    y3
#> 1   1   1  TRUE
#> 2   2   2  TRUE
#> 3 999 999 FALSE
#> 4   3   3 FALSE
#> 5   4   4  TRUE

z <- y %>%
  mutate_at(vars(y1:y2), ~ifelse(. == 999, NA, .))

z

#>   y1 y2    y3
#> 1  1  1  TRUE
#> 2  2  2  TRUE
#> 3 NA NA FALSE
#> 4  3  3 FALSE
#> 5  4  4  TRUE

8 Comments

Thanks, evertr. This does solve the problem. It retains the if_else() approach instead of using recode(), but I can live with that. I can use the "." as you suggest to avoid changing to numeric. I am not clear why it does not complain that the NA for true is the wrong type. In my original code I had to use as.numeric(NA) or as.integer(NA) to avoid errors. Do you know why it does not give an error here?
Ahh, OK. I see that you used ifelse(), which does not check the type the same way that if_else() does. Do you know how this could be done with if_else() without casting the whole data frame as real?
@D.Bontempo you could use mutate_if(is.numeric, ...), which also matches integers, so that you don't have to select all the variables (like in the solution from @www, but without converting anything). @everetr I'd recommend removing that as.numeric from your solution, because there is no need for type conversion. Then it would be +1-worthy ;-)
@Tino, See my comment below the code. I already said one can omit as.numeric, if desired. I didn't know if @D. Bontempo wanted conversion or not.
@D.Bontempo See my comment below the code. You can omit as.numeric to avoid converting y$y2 from integer to numeric.
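
One possible way to do what the if_else() question above asks, offered as a sketch rather than a definitive answer: indexing a vector with NA_integer_ returns a length-one NA of that vector's own type, so the replacement matches each column automatically. The helper name typed_na() is hypothetical and introduced only for this example. (Recent dplyr versions, 1.1.0 and later as far as I know, also accept a bare NA in if_else(), which would make the helper unnecessary.)

```r
library(dplyr)

# Example data from the question
y <- data.frame(y1 = c(1, 2, 999, 3, 4),
                y2 = c(1L, 2L, 999L, 3L, 4L),
                y3 = c(TRUE, TRUE, FALSE, FALSE, TRUE))

# Hypothetical helper: x[NA_integer_] is a single NA with the same type as x
typed_na <- function(x) x[NA_integer_]

# if_else() sees an NA of the matching type for each column,
# so no column has to be converted first
z <- y %>%
  mutate_at(vars(y1:y2), ~ if_else(. == 999, typed_na(.), .))

str(z)
```

With this approach y1 stays double and y2 stays integer.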
1

If you are trying to recode a value to NA, the na_if() function should also work.
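
A minimal sketch of this suggestion, applied with mutate_at() to the example data from the question. As far as I can tell, na_if() keeps the type of its input, so the integer column should remain integer.

```r
library(dplyr)

# Example data from the question
y <- data.frame(y1 = c(1, 2, 999, 3, 4),
                y2 = c(1L, 2L, 999L, 3L, 4L),
                y3 = c(TRUE, TRUE, FALSE, FALSE, TRUE))

# na_if(x, 999) replaces every 999 in x with NA
z <- y %>%
  mutate_at(vars(y1:y2), ~ na_if(., 999))

# Inspect the resulting column types
sapply(z, class)
```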
