I would like to replace 0 with 1 in my data.frame, but only in factor columns whose only values are 0, 1 or NA. I also need to avoid specifying columns by name, since my real data set is quite large and that would be cumbersome. So I thought I could use dplyr::mutate_if and tried something like:

df %>% mutate_if(~(is.factor(.) & (unique(.) %in% c(0, 1, NA))), ~replace(., . == 0, 1))

but ended up with the following error:

Error in selected[[i]] <- .p(.tbl[[vars[[i]]]], ...) : more elements supplied than there are to replace

What is wrong with this formula? How can I use dplyr to replace 0 with 1? My example dataset is shown below:

df <- structure(list(a1 = structure(c(1L, NA, NA, 2L, NA, 1L, NA), .Label = c("0", 
"1"), class = "factor"), a2 = structure(c(NA, NA, NA, 1L, NA, 
NA, NA), .Label = "1", class = "factor"), a3 = structure(c(NA, 
1L, 2L, 3L, NA, 4L, 2L), .Label = c("0", "1", "2", "6"), class = "factor"), 
a4 = structure(c(1L, 1L, NA, NA, NA, NA, 1L), .Label = "0", class = 
"factor"), 
a5 = c(0L, 1L, 1L, NA, 1L, 0L, NA)), .Names = c("a1", "a2", 
"a3", "a4", "a5"), class = c("tbl_df", "tbl", "data.frame"), row.names = 
c(NA, -7L))
  • All the columns in your example are numeric, not factor. Commented Jun 21, 2018 at 12:39
  • Example edited to match the case. Commented Jun 21, 2018 at 12:55

2 Answers

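The predicate passed to mutate_if must return a single TRUE or FALSE per column, but unique(.) %in% c(0, 1, NA) returns one logical per unique value, which triggers the error. Wrapping it in all() fixes that, and plyr::revalue then relabels the level "0" as "1":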

library(dplyr)
df %>%
  mutate_if(~ is.factor(.) && all(unique(.) %in% c(0, 1, NA)),
            ~ plyr::revalue(., c("0" = "1")))

# # A tibble: 7 x 5
#   a1    a2    a3    a4       a5
#   <fct> <fct> <fct> <fct> <int>
# 1 1     <NA>  <NA>  1         0
# 2 <NA>  <NA>  0     1         1
# 3 <NA>  <NA>  1     <NA>      1
# 4 1     1     2     <NA>     NA
# 5 <NA>  <NA>  <NA>  <NA>      1
# 6 1     <NA>  6     <NA>      0
# 7 <NA>  <NA>  1     1        NA
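
For illustration, here is a minimal base R sketch of why the original predicate fails and how all() collapses it to a single flag. It is not part of the answer above: the helper names zero_to_one and is_binary_factor are hypothetical, the small data frame is made up, and the relabelling is done through levels() instead of plyr::revalue.

```r
# Made-up factor with levels "0" and "1" plus NAs (similar to a1 in the question)
x <- factor(c("0", NA, NA, "1", NA, "0", NA))

# The original predicate yields one logical per unique value, not a single flag
length(unique(x) %in% c(0, 1, NA))

# all() collapses it to the single TRUE/FALSE that a column predicate needs
all(unique(x) %in% c(0, 1, NA))

# Hypothetical helper: relabel level "0" as "1" via levels();
# if "1" already exists, the two levels are merged
zero_to_one <- function(f) {
  levels(f)[levels(f) == "0"] <- "1"
  f
}

# Hypothetical predicate: factor column containing only 0, 1 or NA
is_binary_factor <- function(col) {
  is.factor(col) && all(unique(col) %in% c(0, 1, NA))
}

# Made-up data frame: a1 qualifies, a3 has other levels, a5 is not a factor
df_small <- data.frame(
  a1 = factor(c("0", NA, "1")),
  a3 = factor(c("0", "2", "6")),
  a5 = c(0L, 1L, NA)
)

# Apply the helper only to the columns that pass the predicate
target <- vapply(df_small, is_binary_factor, logical(1))
df_small[target] <- lapply(df_small[target], zero_to_one)
df_small
```

Working on the levels avoids the "invalid factor level" warning that replace() produces when a factor column has no "1" level yet, as is the case for a4 in the question.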

How about this?

df %>%
    mutate_if(is.factor, funs(ifelse(as.character(.) == "0", "1", as.character(.)))) %>%
    mutate_if(is.character, as.factor)
## A tibble: 7 x 5
#  a1    a2    a3    a4       a5
#  <fct> <fct> <fct> <fct> <int>
#1 1     NA    NA    1         0
#2 NA    NA    1     1         1
#3 NA    NA    1     NA        1
#4 1     1     2     NA       NA
#5 NA    NA    NA    NA        1
#6 1     NA    6     NA        0
#7 NA    NA    1     1        NA

3 Comments

  • Not entirely applicable, as I also have character variables in my original dataset.
  • Not correct. The OP wants the rule applied only to columns consisting solely of 0, 1 or NA.
  • @jakes OK, I see. I wasn't entirely clear on which columns the rule needed to be applied to; I understood it as all factor columns. Should've read more carefully...
