I am teaching myself to use the tidyverse more, in the hope of writing cleaner code in the future.

I have data that looks like this:

data <- as_tibble(data.frame(x = c(1, 2, 3, 3, 4),
                             y = c(3, 4, 4, 2, 5),
                             z = c(1, 1, 5, 5, 3)))

And I would like to get the mean, sd, and confidence intervals for all 3 columns.

The code I am hoping to use is this:

data %>%
summarize_at(vars(x:z), list(mean=mean, sd=sd, cilow = ci[2], cihigh = ci[3]))

where the ci() function is from the gmodels package. When passing a single variable through ci, you can pick which output column to view, but when it's part of a list of functions, I get the error

Error in ci[2] : object of type 'closure' is not subsettable

Any advice/suggestions are appreciated! I am trying not to manually calculate all the CIs (my actual data has many more variables to calculate)

1 Answer

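We can use a lambda function. Written without parentheses, ci is the function object itself, so ci[2] tries to subset a closure; ~ ci(.x)[2] instead calls ci on each column and then takes the second element of the result. In addition, the _at/_all variants are superseded in favor of across.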

library(dplyr)
library(gmodels)
data %>%
  summarise(across(x:z, list(
    mean   = ~ mean(.x, na.rm = TRUE),
    sd     = ~ sd(.x, na.rm = TRUE),
    cilow  = ~ ci(.x)[2],
    cihigh = ~ ci(.x)[3]
  )))

-output

# A tibble: 1 × 12
  x_mean  x_sd x_cilow x_cihigh y_mean  y_sd y_cilow y_cihigh z_mean  z_sd z_cilow z_cihigh
   <dbl> <dbl>   <dbl>    <dbl>  <dbl> <dbl>   <dbl>    <dbl>  <dbl> <dbl>   <dbl>    <dbl>
1    2.6  1.14    1.18     4.02    3.6  1.14    2.18     5.02      3     2   0.517     5.48
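
To see why the original call failed, here is a minimal base R sketch. The function my_ci is a hypothetical stand-in (it is not part of gmodels); it only assumes, as the indexing above suggests, that the interval function returns a vector whose second and third elements are the lower and upper bounds.

```r
# Hypothetical stand-in for an interval function that returns a named vector.
# It computes a t-based 95% interval for the mean; this is an assumption made
# for illustration, not the gmodels implementation.
my_ci <- function(v) {
  n <- length(v)
  half <- qt(0.975, df = n - 1) * sd(v) / sqrt(n)
  c(estimate = mean(v), lower = mean(v) - half, upper = mean(v) + half)
}

x <- c(1, 2, 3, 3, 4)

# Subsetting the function object itself raises an error, which is caught here
# and stored as a message string.
res <- tryCatch(my_ci[2], error = function(e) conditionMessage(e))
print(res)

# Calling the function first and then subsetting its result works.
lower <- my_ci(x)[2]
print(lower)

# A formula lambda such as ~ my_ci(.x)[2] is shorthand for a function like this.
lower_fn <- function(.x) my_ci(.x)[2]
print(lower_fn(x))
```

The same reasoning applies inside list(): every element has to be a function (or a formula that dplyr turns into one), not the result of indexing into a function.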

Or with summarise_at

data %>%
  summarize_at(vars(x:z), list(mean = mean, sd = sd, cilow = ~ ci(.x)[2], cihigh = ~ ci(.x)[3]))

-output

# A tibble: 1 × 12
  x_mean y_mean z_mean  x_sd  y_sd  z_sd x_cilow y_cilow z_cilow x_cihigh y_cihigh z_cihigh
   <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>    <dbl>    <dbl>    <dbl>
1    2.6    3.6      3  1.14  1.14     2    1.18    2.18   0.517     4.02     5.02     5.48
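
If depending on gmodels is not desirable, the same summary can be sketched with a small helper. The helper t_ci below is hypothetical (not from any package) and assumes that a t-based 95% interval for the mean is what is wanted; the numbers in the output above are consistent with that assumption.

```r
library(dplyr)

# Hypothetical helper: t-based confidence interval for the mean of a numeric
# vector, returned as a named vector so the bounds can be picked by name.
t_ci <- function(v, level = 0.95) {
  v <- v[!is.na(v)]
  n <- length(v)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(v) / sqrt(n)
  c(lower = mean(v) - half, upper = mean(v) + half)
}

data <- tibble(x = c(1, 2, 3, 3, 4),
               y = c(3, 4, 4, 2, 5),
               z = c(1, 1, 5, 5, 3))

out <- data %>%
  summarise(across(x:z, list(
    mean   = ~ mean(.x, na.rm = TRUE),
    sd     = ~ sd(.x, na.rm = TRUE),
    cilow  = ~ t_ci(.x)[["lower"]],
    cihigh = ~ t_ci(.x)[["upper"]]
  )))

print(out)
```

Picking the bounds by name rather than by position makes the lambdas a little easier to read and avoids relying on the order of the returned vector.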

10 Comments

With summarize and across I was getting the error "Error in dots[[.index]] : subscript out of bounds", but sticking with summarize_at, it worked!
@GabrielleMacklin please check your packageVersion('dplyr'). If it is an old version, across may not work.
It's the most recent version
@GabrielleMacklin I used 1.0.8 and it didn't show an error, though.
@GabrielleMacklin I wouldn't install the tidyverse package, as the dependencies may have broken. I always install the individual packages.