3

I am trying to write a function that uses dplyr::summarise to obtain means of multiple columns of a data frame and assign dynamic names to the summarised columns using the new rlang glue syntax and := operator.

Here's a simple example of my problem using the mtcars dataset.

When summarising over just one column - the glue syntax works (i.e. the summarised column name is mean_mpg):

mean_fun <- function(data, group_cols, summary_col) {
 data %>%
 group_by(across({{ group_cols }})) %>%
 summarise("mean_{{ summary_col }}" := mean({{ summary_col }}, na.rm = T))
}
mean_fun(mtcars, c(cyl, gear), mpg)

   cyl  gear mean_mpg
  <dbl> <dbl>    <dbl>
1     4     3     21.5
2     4     4     26.9
3     4     5     28.2
4     6     3     19.8
5     6     4     19.8
6     6     5     19.7
7     8     3     15.0
8     8     5     15.4

But the equivalent does not name the cols properly when summarising over multiple columns:

mean_fun_multicols <- function(data, group_cols, summary_cols) {
  data %>%
    group_by(across({{ group_cols }})) %>%
    summarise("mean_{{ summary_cols }}" := across({{ summary_cols }}, ~ mean(., na.rm = T)))
}
mean_fun_multicols(mtcars, c(cyl, gear), c(mpg, wt))

    cyl  gear `mean_c(mpg, wt)`$mpg   $wt
  <dbl> <dbl>                 <dbl> <dbl>
1     4     3                  21.5  2.46
2     4     4                  26.9  2.38
3     4     5                  28.2  1.83
4     6     3                  19.8  3.34
5     6     4                  19.8  3.09
6     6     5                  19.7  2.77
7     8     3                  15.0  4.10
8     8     5                  15.4  3.37

How can I get the summarised column names to read mean_mpg and mean_wt? And why does this not work?

I realise that there are likely many other ways to perform this task but I would like to know how to get this method (i.e. using tidy eval, rlang syntax in a bespoke function) to work for teaching purposes and my own understanding!

Thank you

1 Answer 1

2

We could use .names in across to rename

mean_fun_multicols <- function(data, group_cols, summary_cols) {
  data %>%
    group_by(across({{group_cols}})) %>%
     summarise(across({{ summary_cols }},
         ~ mean(., na.rm = TRUE), .names = "mean_{.col}"), .groups = "drop")
}

-testing

mean_fun_multicols(mtcars, c(cyl, gear), c(mpg, wt))
# A tibble: 8 × 4
    cyl  gear mean_mpg mean_wt
  <dbl> <dbl>    <dbl>   <dbl>
1     4     3     21.5    2.46
2     4     4     26.9    2.38
3     4     5     28.2    1.83
4     6     3     19.8    3.34
5     6     4     19.8    3.09
6     6     5     19.7    2.77
7     8     3     15.0    4.10
8     8     5     15.4    3.37

NOTE: The := is mainly used when there is a single column in tidyverse


If we use the OP's function, we are assigning multiple columns to a single column and this returns a tibble instead of a normal column. We may need to unpack

library(tidyr)
> mean_fun_multicols(mtcars, c(cyl, gear), c(mpg, wt)) %>% str
`summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
grouped_df [8 × 3] (S3: grouped_df/tbl_df/tbl/data.frame)
 $ cyl            : num [1:8] 4 4 4 6 6 6 8 8
 $ gear           : num [1:8] 3 4 5 3 4 5 3 5
 $ mean_c(mpg, wt): tibble [8 × 2] (S3: tbl_df/tbl/data.frame)
  ..$ mpg: num [1:8] 21.5 26.9 28.2 19.8 19.8 ...
  ..$ wt : num [1:8] 2.46 2.38 1.83 3.34 3.09 ...
 - attr(*, "groups")= tibble [3 × 2] (S3: tbl_df/tbl/data.frame)
  ..$ cyl  : num [1:3] 4 6 8
  ..$ .rows: list<int> [1:3] 
  .. ..$ : int [1:3] 1 2 3
  .. ..$ : int [1:3] 4 5 6
  .. ..$ : int [1:2] 7 8
  .. ..@ ptype: int(0) 
  ..- attr(*, ".drop")= logi TRUE

> mean_fun_multicols(mtcars, c(cyl, gear), c(mpg, wt)) %>% 
        unpack(where(is_tibble))
`summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
# A tibble: 8 × 4
# Groups:   cyl [3]
    cyl  gear   mpg    wt
  <dbl> <dbl> <dbl> <dbl>
1     4     3  21.5  2.46
2     4     4  26.9  2.38
3     4     5  28.2  1.83
4     6     3  19.8  3.34
5     6     4  19.8  3.09
6     6     5  19.7  2.77
7     8     3  15.0  4.10
8     8     5  15.4  3.37
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.