Given the following data frame:

mydf <- data.frame(
    Treatment = c('T1', 'T1', 'T1', 'T1', 'T1', 'T1', 'T2', 'T2', 'T2', 'T2', 'T2', 'T2'),
    Observation = c('pH', 'pH', 'pH', 'RS', 'RS', 'RS', 'pH', 'pH', 'pH', 'RS', 'RS', 'RS'),
    Value = c(3.13, 3.21, 3.26, 19.20, 19.50, 9.70, 3.13, 3.40, 3.31, 11.00, 18.10, 7.50)
)

I need to generate a data frame where the rows are treatments, the columns are observations, and the values are strings giving the mean and standard deviation of the relevant values. Here is some code (using dplyr and tidyr) which builds such a data frame:

library(dplyr)
library(tidyr)

mydf %>% group_by(Treatment, Observation) %>%
  summarise(MeanSD = sprintf("%0.2f $\\pm$ %0.2f", mean(Value), sd(Value))) %>%
  spread(Observation, MeanSD) %>%
  ungroup()

And here is the output of that code:

# A tibble: 2 x 3
  Treatment                 pH                  RS
*    <fctr>              <chr>               <chr>
1        T1 "3.20 $\\pm$ 0.07" "16.13 $\\pm$ 5.57"
2        T2 "3.28 $\\pm$ 0.14" "12.20 $\\pm$ 5.40"

I have now been told that I need to set the number of decimal places in those strings based on the observation. For the sake of argument, let's assume the pH mean and SD should have 2 and 2 decimal places, respectively, while the RS mean and SD should have 0 and 1, respectively.

fmtStr <- list('pH'="%0.2f $\\pm$ %0.2f", 'RS'="%0.0f $\\pm$ %0.1f")

I tried this:

mydf %>% group_by(Treatment, Observation) %>%
  summarise(MeanSD = sprintf(fmtStr[[Observation]], mean(Value), sd(Value))) %>%
  spread(Observation, MeanSD) %>%
  ungroup()

And that generated this error:

Error in summarise_impl(.data, dots) : 
  Evaluation error: recursive indexing failed at level 2.

What's the right incantation to achieve my goal?

1 Answer

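You get that error because [[ extracts a single element and so expects a single index. Inside summarise(), Observation holds every value in the group, and when [[ is given a vector of several values it treats them as a path into nested lists (recursive indexing), which fails here. You can see the same thing outside of dplyr...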

fmtStr[[mydf$Observation]]
# Error in fmtStr[[mydf$Observation]] : 
#   recursive indexing failed at level 2
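
To see the two operators side by side, here is a small base R sketch. The vector obs is a hypothetical stand-in for what Observation holds inside a single group; fmtStr is the list from the question.

```r
fmtStr <- list(pH = "%0.2f $\\pm$ %0.2f", RS = "%0.0f $\\pm$ %0.1f")

# Hypothetical stand-in for Observation within one group of three rows
obs <- c("RS", "RS", "RS")

# [[ with a single name returns the element itself (a character string)
one <- fmtStr[[obs[1]]]

# [ with several names returns a list with one entry per name
many <- fmtStr[obs]

# [[ with several names attempts recursive indexing and raises an error
failed <- inherits(try(fmtStr[[obs]], silent = TRUE), "try-error")
```

So [[ is fine as long as you hand it exactly one name, which is what the rest of this answer works towards.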

You can subset the list with fmtStr[mydf$Observation] and convert it to a character vector with unlist(), but that still won't work in your summarise() command. sprintf() is vectorised over its format argument, so you get one string per row in the group rather than a single summary value...

mydf %>% 
  group_by(Treatment, Observation) %>% 
  summarise(MeanSD = sprintf(unlist(fmtStr[Observation]), mean(Value), sd(Value)))
# Error in summarise_impl(.data, dots) : 
#   Column `MeanSD` must be length 1 (a summary value), not 3
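
The length-3 result comes from sprintf() itself, not from dplyr. A minimal base R sketch, where 16.133 and 5.571 are made-up numbers standing in for one group's mean and SD:

```r
# Three copies of the same format, as you would get for a group of three rows
fmt <- rep("%0.0f $\\pm$ %0.1f", 3)

# sprintf() recycles the single mean and SD across all three formats
out <- sprintf(fmt, 16.133, 5.571)
n_out <- length(out)

# Passing a single format gives the single string summarise() needs
single <- sprintf(fmt[1], 16.133, 5.571)
```

Here n_out is 3, while single is one string, which is why picking one format per group fixes the error.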

Since your data is grouped by Observation, every value of Observation is the same within a group, so you can subset the list as before and just take the first element...

mydf %>% 
  group_by(Treatment, Observation) %>% 
  summarise(MeanSD = sprintf(fmtStr[Observation][[1]], mean(Value), sd(Value)))
# # A tibble: 4 x 3
# # Groups:   Treatment [?]
#   Treatment Observation MeanSD            
#   <fct>     <fct>       <chr>             
# 1 T1        pH          "3.20 $\\pm$ 0.07"
# 2 T1        RS          "16 $\\pm$ 5.6"   
# 3 T2        pH          "3.28 $\\pm$ 0.14"
# 4 T2        RS          "12 $\\pm$ 5.4"  
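
One caveat, assuming Observation is a factor (as the <fct> column type above suggests): indexing a list with a factor uses the factor's integer codes, not its labels. That only gives the right format when the level order happens to match the order of the list. A small base R sketch with deliberately mismatched levels (the fmt list and its labels are hypothetical):

```r
fmt <- list(pH = "format for pH", RS = "format for RS")

# Levels ordered so that "RS" has integer code 1, unlike its position in fmt
obs <- factor(c("RS", "RS"), levels = c("RS", "pH"))

# Indexing by the factor uses code 1, i.e. the first list element
by_code <- fmt[obs][[1]]

# Converting to character first looks the element up by name instead
by_name <- fmt[[as.character(obs[1])]]
```

With these levels by_code picks up the pH format even though the observation is RS, while by_name gets the RS format. Wrapping the index in as.character() avoids depending on the level order.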

So your full code would look like...

mydf %>% 
  group_by(Treatment, Observation) %>% 
  summarise(MeanSD = sprintf(fmtStr[Observation][[1]], mean(Value), sd(Value))) %>% 
  spread(Observation, MeanSD) %>% 
  ungroup()
# # A tibble: 2 x 3
#   Treatment pH                 RS             
#   <fct>     <chr>              <chr>          
# 1 T1        "3.20 $\\pm$ 0.07" "16 $\\pm$ 5.6"
# 2 T2        "3.28 $\\pm$ 0.14" "12 $\\pm$ 5.4"
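
If you are on newer package versions (an assumption on my part: dplyr 1.0 or later and tidyr 1.0 or later), spread() has been superseded by pivot_wider(), and summarise() accepts .groups = "drop" in place of the trailing ungroup(). A sketch of the same pipeline written that way, with the lookup done by name via as.character():

```r
library(dplyr)
library(tidyr)

mydf <- data.frame(
  Treatment = rep(c("T1", "T2"), each = 6),
  Observation = rep(rep(c("pH", "RS"), each = 3), times = 2),
  Value = c(3.13, 3.21, 3.26, 19.20, 19.50, 9.70,
            3.13, 3.40, 3.31, 11.00, 18.10, 7.50)
)

fmtStr <- list(pH = "%0.2f $\\pm$ %0.2f", RS = "%0.0f $\\pm$ %0.1f")

result <- mydf %>%
  group_by(Treatment, Observation) %>%
  summarise(
    # One format per group, looked up by name rather than by factor code
    MeanSD = sprintf(fmtStr[[as.character(Observation[1])]],
                     mean(Value), sd(Value)),
    .groups = "drop"
  ) %>%
  pivot_wider(names_from = Observation, values_from = MeanSD)
```

The strings should match the output above; the order of the pH and RS columns may differ depending on how the groups are sorted.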