
I have a data frame where I'm trying to count the number of non-NA values in each row:

csv <- "A,B,C,D,E,F,G
NA,NA,1,NA,NA,NA,NA
0,NA,0,NA,1,NA,1
NA,1,1,NA,0,NA,NA
0,1,1,0,NA,NA,0"

library(dplyr)

temp <- read.csv(text=csv)
temp %>% mutate(twopresent = ifelse((!is.na(A)+!is.na(B)+!is.na(C)) >= 2, TRUE, FALSE),
                total = sum(A,B,C, na.rm=TRUE) )

I expect to see:

   A  B C  D  E  F  G twopresent total
1 NA NA 0 NA NA NA NA      FALSE     0
2  0 NA 0 NA  1 NA  1       TRUE     0
3 NA  1 1 NA  0 NA NA       TRUE     2
4  0  1 0  0 NA NA  0       TRUE     1

But I get:

   A  B C  D  E  F  G twopresent total
1 NA NA 1 NA NA NA NA      FALSE     5
2  0 NA 0 NA  1 NA  1      FALSE     5
3 NA  1 1 NA  0 NA NA      FALSE     5
4  0  1 1  0 NA NA  0      FALSE     5
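
One possible reading of both symptoms, sketched in base R with the three columns typed out as plain vectors (the names unparenthesised, counts and grand_total are made up for illustration). R gives ! lower precedence than +, so the unparenthesised expression collapses to a single logical value per row, which can never reach 2. Separately, sum() inside mutate() adds up the whole of A, B and C rather than working row by row.

```r
# Columns A, B and C from the example data, as plain vectors
A <- c(NA, 0, NA, 0)
B <- c(NA, NA, 1, 1)
C <- c(1, 0, 1, 1)

# `!` binds more loosely than `+`, so this is parsed as
# !(is.na(A) + !(is.na(B) + !is.na(C))) and stays logical (at most 1)
unparenthesised <- !is.na(A) + !is.na(B) + !is.na(C)

# Parenthesising each test gives the intended per-row count of non-NA values
counts <- (!is.na(A)) + (!is.na(B)) + (!is.na(C))

# sum() collapses everything it is given into a single number
grand_total <- sum(A, B, C, na.rm = TRUE)

print(unparenthesised >= 2)  # FALSE for every row
print(counts)                # 1 2 2 3
print(grand_total)           # 5
```

This matches the output shown above: twopresent is FALSE everywhere and total is 5 in every row.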

My attempt at a rowSums solution in dplyr:

temp %>% mutate(twopresent = 
                ifelse(rowSums(!is.na(A), !is.na(B), !is.na(C)) >= 2, 
                  TRUE, 
                  FALSE),
                total = rowSums(A,B,C, na.rm=TRUE) )

gives:

Error: 'x' must be an array of at least two dimensions
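
The error most likely comes from how rowSums reads its arguments: the first one, x, has to be a matrix or data frame, and any further positional arguments are taken as na.rm and dims, not as extra columns. A minimal sketch, assuming dplyr is installed, is to bind the columns into a matrix with cbind() before calling rowSums; result is just a name chosen for illustration.

```r
library(dplyr)

csv <- "A,B,C,D,E,F,G
NA,NA,1,NA,NA,NA,NA
0,NA,0,NA,1,NA,1
NA,1,1,NA,0,NA,NA
0,1,1,0,NA,NA,0"

temp <- read.csv(text = csv)

result <- temp %>%
  mutate(
    # cbind() turns the three column vectors into one 4 x 3 matrix,
    # which is the kind of input rowSums() expects
    twopresent = rowSums(!is.na(cbind(A, B, C))) >= 2,
    # Row-wise sum of the values themselves, skipping NAs
    total = rowSums(cbind(A, B, C), na.rm = TRUE)
  )

print(result)
```

The comparison >= 2 already returns TRUE or FALSE, so the ifelse(..., TRUE, FALSE) wrapper is not needed.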

  • A row-wise count of non-NA values is easy to compute with rowSums, as in rowSums(!is.na(temp[c("A", "B", "C")])). Commented Aug 31, 2016 at 10:20
  • And for the total column, it's simply rowSums(temp[1:3], na.rm = TRUE) (you've messed up your desired output a bit). This is why you should learn R before you learn dplyr. Commented Aug 31, 2016 at 10:22
  • @pluke Are you sure total is correct? Commented Aug 31, 2016 at 10:23
  • Sorry, I mistakenly copied the wrong version of the code and have updated it. Will try rowSums. Commented Aug 31, 2016 at 10:24
  • I have got the following to work:
    temp$total <- rowSums(!is.na(temp[c("A", "B", "C")]))
    temp$twopresent <- ifelse(rowSums(!is.na(temp[c("A", "B", "C")])) >= 2, TRUE, FALSE)
    I can't get rowSums to work correctly in dplyr, however. See the example above. Commented Aug 31, 2016 at 10:31
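
Putting the comments together, a base R sketch might look like the following. Note that the snippet in the last comment stores the non-NA count in total, whereas the earlier comment computes total as the row sum of the values; the sketch keeps both, using the illustrative names cols and n_present, which do not appear in the original post.

```r
csv <- "A,B,C,D,E,F,G
NA,NA,1,NA,NA,NA,NA
0,NA,0,NA,1,NA,1
NA,1,1,NA,0,NA,NA
0,1,1,0,NA,NA,0"

temp <- read.csv(text = csv)

# Columns to inspect (assumption: only A, B and C matter, as in the question)
cols <- c("A", "B", "C")

# Number of non-NA values per row
temp$n_present <- rowSums(!is.na(temp[cols]))

# TRUE when at least two of the three columns are present
temp$twopresent <- temp$n_present >= 2

# Row-wise sum of the values, ignoring NAs
temp$total <- rowSums(temp[cols], na.rm = TRUE)

print(temp)
```

Indexing with temp[cols] keeps a data frame, which rowSums accepts directly, so no conversion step is needed.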
