I'm trying to generate multiple new columns in an R data frame, with the new column names built dynamically from a vector. The new columns are computed from the groups (levels) of a single column.

The data frame contains measurements (counts) of different chemical elements (element) along depth (z). Each new column is computed by dividing the counts of every element at a given depth by the counts of one proxy element (from proxies) at the same depth.

I already have a solution using mutate that works when I create only one new column and name it explicitly (see code below). I'm looking for a generalised solution to use in a Shiny web app, where proxies is not a single string but a vector of strings that changes dynamically with user input.

# Working code for just one new column at a time (here Ti_ratio)

library(tidyverse)

proxies <- "Ti"
df <- tibble(z = rep(1:10, 4), element = rep(c("Ag", "Fe", "Ca", "Ti"), each = 10), counts = rnorm(40))

df_Ti <- df %>%
  group_by(z) %>%
  mutate(Ti_ratio = counts/counts[element %in% proxies])

# Not working code for multiple columns at a time

proxies <- c("Ca", "Fe", "Ti")
varname <- paste(proxies, "ratio", sep = "_")

df_ratios <- df %>%
  group_by(z) %>%
  map(~ mutate(!!varname = .x$counts/.x$counts[element %in% proxies]))

Output of working code:

> head(df_Ti)
# A tibble: 6 x 4
# Groups:   z [6]
      z element counts Ti_ratio
  <int> <chr>    <dbl>    <dbl>
1     1 Ag       2.41     4.10 
2     2 Ag      -1.06    -0.970
3     3 Ag      -0.312   -0.458
4     4 Ag      -0.186    0.570
5     5 Ag       1.12    -1.38 
6     6 Ag      -1.68    -2.84

Expected output of not working code:

> head(df_ratios)
# A tibble: 6 x 6
# Groups:   z [6]
      z element counts Ca_ratio Fe_ratio Ti_ratio
  <int> <chr>    <dbl>    <dbl>    <dbl>    <dbl>
1     1 Ag       2.41     4.78   -10.1      4.10 
2     2 Ag      -1.06     3.19     0.506   -0.970
3     3 Ag      -0.312   -0.479   -0.621   -0.458
4     4 Ag      -0.186   -0.296   -0.145    0.570
5     5 Ag       1.12     0.353    3.19    -1.38 
6     6 Ag      -1.68    -2.81    -0.927   -2.84 

Edit: I found a general solution to my problem with base R using two nested for-loops, similar to the answer posted by @fra (the difference being that here I loop both over the depth and the proxies):

library(tidyverse)
df <- tibble(z = rep(1:3, 4), element = rep(c("Ag", "Ca", "Fe", "Ti"), each = 3), counts = runif(12)) %>% arrange(z, element)
proxies <- c("Ca", "Fe", "Ti")

for (f in seq_along(proxies)) {
  proxy <- proxies[f]
  tmp2 <- NULL
  for (i in unique(df$z)) {
    tmp <- df[df$z == i,]
    tmp <- as.data.frame(tmp$counts/tmp$counts[tmp$element %in% proxy])
    names(tmp) <- paste(proxy, "ratio", sep = "_")
    tmp2 <- rbind(tmp2, tmp)
  }
  df[, 3 + f] <- tmp2
}

And the correct output:

> head(df)
# A tibble: 6 x 6
      z element counts Ca_ratio Fe_ratio Ti_ratio
  <int> <chr>    <dbl>    <dbl>    <dbl>    <dbl>
1     1 Ag      0.690    0.864      9.21    1.13 
2     1 Ca      0.798    1         10.7     1.30 
3     1 Fe      0.0749   0.0938     1       0.122
4     1 Ti      0.612    0.767      8.17    1    
5     2 Ag      0.687    0.807      3.76    0.730
6     2 Ca      0.851    1          4.66    0.904

I reduced the data frame to fewer rows so that it's easy to see why this solution is correct (the ratio of an element with itself is 1). I'm still interested in a more elegant solution that I could use with pipes.

2 Answers

A tidyverse option is to wrap your original code in a function that builds the ratio column for one proxy, then call it for every proxy with map_dfc to create the new columns.

library(tidyverse)

proxies <- c("Ca", "Fe", "Ti")

# Return the ratio column for a single proxy element x
your_func <- function(x){
    ratio_name <- paste(x, "ratio", sep = "_")

    df %>%
       group_by(z) %>%
       mutate(!!ratio_name := counts/counts[element %in% !!x]) %>%
       ungroup() %>%
       select(!!ratio_name)
}

# Build one ratio column per proxy and bind them onto df
df %>%
   group_modify(~map_dfc(proxies, your_func)) %>%
   bind_cols(df, .) %>%
   arrange(z, element)


#       z element  counts Ca_ratio Fe_ratio Ti_ratio
#   <int> <chr>     <dbl>    <dbl>    <dbl>    <dbl>
# 1     1 Ag      -0.112   -0.733    -0.197   -1.51 
# 2     1 Ca       0.153    1         0.269    2.06 
# 3     1 Fe       0.570    3.72      1        7.66 
# 4     1 Ti       0.0743   0.485     0.130    1    
# 5     2 Ag       0.881    0.406    -6.52    -1.49 
# 6     2 Ca       2.17     1       -16.1     -3.69 
# 7     2 Fe      -0.135   -0.0622    1        0.229
# 8     2 Ti      -0.590   -0.271     4.37     1    
# 9     3 Ag       0.398    0.837     0.166   -0.700
#10     3 Ca       0.476    1         0.198   -0.836
# ... with 30 more rows
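
As an alternative sketch (not the code from the answer above), the same idea can be written so that each step takes a data frame and returns it with one more column, which lets purrr's reduce() chain the proxies inside a single pipe without bind_cols. The helper name add_ratio and the small example data are assumptions made for illustration, and the sketch assumes every depth has exactly one row for each proxy element.

```r
library(dplyr)
library(purrr)

set.seed(42)  # arbitrary seed, only for reproducibility
df <- tibble(
  z = rep(1:3, 4),
  element = rep(c("Ag", "Ca", "Fe", "Ti"), each = 3),
  counts = runif(12)
)
proxies <- c("Ca", "Fe", "Ti")

# Hypothetical helper: add one "<proxy>_ratio" column to `data`.
# Assumes each depth z has exactly one row for `proxy`.
add_ratio <- function(data, proxy) {
  ratio_name <- paste(proxy, "ratio", sep = "_")
  data %>%
    group_by(z) %>%
    mutate(!!ratio_name := counts / counts[element == !!proxy]) %>%
    ungroup()
}

# reduce() feeds the growing data frame back into add_ratio() for each proxy
df_ratios <- df %>%
  reduce(proxies, add_ratio, .init = .) %>%
  arrange(z, element)

names(df_ratios)
```

Because add_ratio() returns the full data frame, the result keeps z, element and counts next to the new columns, so row alignment never depends on binding separately computed columns.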

2 Comments

That's a good solution. Needed to read up on tidyeval first and I'm still not sure when exactly to use :=. Thanks!
You can use := with !! when you need to unquote the name on the left-hand side, such as when using paste to create the name.
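
A minimal sketch of that point, using a made-up one-column data frame (the names x and new_name are only for illustration): R's parser does not accept !!new_name = ... as a named argument, so := stands in for = whenever the column name is built at run time.

```r
library(dplyr)

df_small <- tibble(x = 1:3)
new_name <- paste("x", "doubled", sep = "_")

# `:=` lets the unquoted string become the name of the new column
res <- df_small %>%
  mutate(!!new_name := x * 2)

names(res)
# "x" "x_doubled"
```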

Using base R

proxies <- c("Ca", "Fe", "Ti")

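for(f in proxies){
   # Divide all counts by the counts of proxy f. The shorter vector is
   # recycled, so this relies on the rows being ordered by element, then z.
   newDF <- as.data.frame(df$counts/df$counts[df$element %in% f])
   names(newDF) <- paste(f, "ratio", sep = "_")
   df <- cbind(df, newDF)
}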

> df
    z element      counts    Ca_ratio    Fe_ratio    Ti_ratio
1   1      Ag -0.40163072 -0.35820754   1.7375395  0.45692965
2   2      Ag -1.00880171  1.27798430  22.8520332 -2.84599471
3   3      Ag  0.72230855 -1.19506223   6.3893485 -0.73558507
4   4      Ag -1.71524002 -1.38942436   1.7564861 -3.03313134
5   5      Ag -0.30813737  1.08127226   4.1985801 -0.33008370
6   6      Ag  0.20524663  0.08910397  -0.3132916 -0.23778331
...
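
The loop above divides the full counts vector by the shorter vector of proxy counts, so the depths only line up when the rows are sorted by element and then by z. A sketch of one way to drop that requirement is to look up the proxy count for each row's depth with match(). The small shuffled data frame is an assumption for illustration, and the sketch assumes each depth has exactly one row per proxy.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
df <- data.frame(
  z = rep(1:3, 4),
  element = rep(c("Ag", "Ca", "Fe", "Ti"), each = 3),
  counts = runif(12)
)
df <- df[sample(nrow(df)), ]  # shuffle rows to show that order does not matter
proxies <- c("Ca", "Fe", "Ti")

for (f in proxies) {
  proxy_rows <- df[df$element == f, ]
  # For every row, find the proxy count measured at the same depth z
  denom <- proxy_rows$counts[match(df$z, proxy_rows$z)]
  df[[paste(f, "ratio", sep = "_")]] <- df$counts / denom
}

head(df[order(df$z, df$element), ])
```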

2 Comments

Thank you for this solution. I would like to use tidy R / code that I can use in a pipe. Any ideas how to do that?
I do not use the tidyverse, so I won't be of much help. I would probably try to wrap map in sapply, something like (%>% sapply(proxies, function(t) map(...)))
