I'm trying to sum the number of NULL values in my dataframe in R. I can easily do so with NA values using the code below:

colSums(is.na(df))

but when I attempt to do this with is.null I get back the following error:

Error in colSums(is.null(df)) : 'x' must be an array of at least two dimensions

I believe the solution is to convert the data frame into another form, but I don't know how, and internet searches have proven fruitless (they often conflate NAs and NULLs).

  • is.na() returns a logical matrix with the same dimensions as the data frame, whereas is.null() returns a single TRUE/FALSE value. It would help to see some or all of the data df (including the NULL values) using e.g. dput(df) or dput(head(df)). NULL may not mean what you think it means e.g. it may be stored as type character. Commented Nov 22, 2021 at 3:11
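
The difference described in this comment can be seen in a small sketch. The data frame below is hypothetical (it is not the asker's df), and the last step assumes the "NULL" values are stored as the character string "NULL", which may or may not be true of the real data:

```r
# Hypothetical example data, NOT the asker's actual df
df <- data.frame(x = c(1, NA, 3),
                 y = c("a", "NULL", NA),
                 stringsAsFactors = FALSE)

# is.na() tests every cell and returns a logical matrix,
# which is why colSums() can work on it
na_mask <- is.na(df)
dim(na_mask)        # 3 2
colSums(na_mask)    # x: 1, y: 1

# is.null() tests the object as a whole and returns one value;
# colSums() rejects that because it has no dimensions
is.null(df)         # FALSE

# Assumption: "NULL" is stored as text, so compare against the string
null_text_counts <- colSums(df == "NULL", na.rm = TRUE)
null_text_counts    # x: 0, y: 1
```

If the NULLs are real NULL objects inside list columns, the string comparison does not apply and the element lengths have to be checked instead.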

1 Answer

A data frame can only hold NULL as an element of a list column, and a NULL element has length 0. To count them, check which elements have length 0. Either with the tidyverse:

library(tidyverse)

d <- tribble(~a, ~b,
             "a", NULL,
             NULL, "y",
             "b", "z")
d
# A tibble: 3 x 2
# a         b        
# <list>    <list>   
# 1 <chr [1]> <NULL>   
# 2 <NULL>    <chr [1]>
# 3 <chr [1]> <chr [1]>
# Count the zero-length (NULL) elements in each list column, then add them up
sum(map_int(d, ~ sum(lengths(.x) == 0L)))
# [1] 2

or Base R:

d <- data.frame(a = I(list("a", NULL, "b")),
                b = I(list(NULL, "y", "z")))
d
# a b
# 1 a  
# 2   y
# 3 b z
sum(apply(d, 2, function(a) sum(vapply(a, function(b) length(b) == 0L, numeric(1)))))
#[1] 2
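
To get one count per column, in the spirit of colSums(is.na(df)) from the question, a minimal sketch using lengths() on each list column could look like this. It rebuilds the same example data as above and assumes the NULLs sit in list columns; null_counts is just an illustrative name:

```r
# Same list-column data frame as in the base R example
d <- data.frame(a = I(list("a", NULL, "b")),
                b = I(list(NULL, "y", "z")))

# lengths() returns the length of every list element; NULL has length 0.
# vapply() runs the count once per column and keeps the column names.
null_counts <- vapply(d, function(col) sum(lengths(col) == 0L), integer(1))
null_counts
# a b 
# 1 1 

# The grand total matches the result above
sum(null_counts)
# [1] 2
```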