I have the following data in R.

oligo  condition  score
REF    Sample     27.827
REF    Sample     24.622
REF    Sample     31.042
REF    Competitor 21.066
REF    Competitor 18.413
REF    Competitor 36.164
ALT    Sample     75.465
ALT    Sample     57.058
ALT    Sample     66.408
ALT    Competitor 35.420
ALT    Competitor 17.652
ALT    Competitor 21.466

I have munged this and taken the averages of the scores for each condition using the group_by and summarise functions in dplyr.

emsa_test <- emsa_1 %>% 
  group_by(oligo,condition) %>%
  summarise_all(mean)

This creates the following table.

oligo  condition  score
ALT    Competitor 24.84600
ALT    Sample     66.31033
REF    Competitor 25.21433
REF    Sample     27.83033

I then plotted this using ggplot2.

ggplot(emsa_test, aes(oligo, score)) + 
geom_bar(aes(fill = condition), 
         width = 0.4, position = position_dodge(width=0.5), color = "black", stat="identity", size=.3) +  
theme_bw() +
ggtitle("CEBP\u03b1") +
theme(plot.title = element_text(size = 40, face = "bold", hjust = 0.5)) +
scale_fill_manual(values = c("#d8b365", "#f5f5f5"))

My issue is that I need to add error bars to the plot. The implementation would be similar to this.

geom_errorbar(aes(ymin=len-se, ymax=len+se), width=.1, position=pd)

However, after the data is munged, the max and min info contained in the first table is lost. I could add the error bars manually, but I have several plots to make, so I wonder if there is a way to retain this info through the pipeline.

Many Thanks.

3 Answers

library(tidyverse)

df <- read_table(
  "oligo  condition  score
REF    Sample     27.827
REF    Sample     24.622
REF    Sample     31.042
REF    Competitor 21.066
REF    Competitor 18.413
REF    Competitor 36.164
ALT    Sample     75.465
ALT    Sample     57.058
ALT    Sample     66.408
ALT    Competitor 35.420
ALT    Competitor 17.652
ALT    Competitor 21.466"
)

df %>%
  group_by(oligo, condition) %>%
  summarise(
    mean = mean(score),
    sd = sd(score),
    n = n(),
    se = sd / sqrt(n)
  ) %>%
  ggplot(aes(x = oligo, y = mean, fill = condition)) +
  geom_col(position = position_dodge()) +
  geom_errorbar(
    aes(ymin = mean - se, ymax = mean + se), 
    position = position_dodge2(padding = 0.5)
  ) +
  labs(
    title = "Mean Score ± 1 SE"
  )
#> `summarise()` has grouped output by 'oligo'. You can override using the
#> `.groups` argument.

Created on 2024-07-08 with reprex v2.1.0

2 Comments

Many Thanks, that's what I was looking for.
Please just update your code to se = sd / sqrt(n)
You can summarise to more than one value and so preserve the min, max and mean:

emsa_test <- emsa_1 %>% 
  group_by(oligo,condition) %>%
  summarise(mean=mean(score),min=min(score),max=max(score))
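
With the minimum and maximum kept in the summary, they can be mapped straight to ymin and ymax in geom_errorbar. A minimal sketch, assuming emsa_1 holds the data from the question (it is rebuilt here so the example runs on its own). Note that the bars then show the range of the replicates, not a standard error. The shared pd object is my own choice, so that bars and error bars are dodged by the same amount.

```r
library(dplyr)
library(ggplot2)

# Data from the question, rebuilt so the example is self-contained
emsa_1 <- data.frame(
  oligo = rep(c("REF", "ALT"), each = 6),
  condition = rep(rep(c("Sample", "Competitor"), each = 3), times = 2),
  score = c(27.827, 24.622, 31.042, 21.066, 18.413, 36.164,
            75.465, 57.058, 66.408, 35.420, 17.652, 21.466)
)

# One row per oligo/condition, keeping mean, min and max
emsa_test <- emsa_1 %>%
  group_by(oligo, condition) %>%
  summarise(mean = mean(score), min = min(score), max = max(score),
            .groups = "drop")

# Assumption: one dodge object reused by both layers so they line up
pd <- position_dodge(width = 0.5)

p <- ggplot(emsa_test, aes(oligo, mean, fill = condition)) +
  geom_col(width = 0.4, position = pd, color = "black") +
  geom_errorbar(aes(ymin = min, ymax = max), width = 0.1, position = pd)

print(p)
```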

Don't have enough reputation to comment, but just noticed a bug in JasonAizkalns' answer, in case someone else simply copies the code: se = sd/sqrt(n)

1 Comment

Yes, it should be sd / sqrt(n). Perhaps you could write out the full code and explain this; it seems like a good additional answer.
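
For anyone who wants the formula spelled out: the standard error of the mean is the sample standard deviation divided by the square root of the number of observations, se = sd / sqrt(n). A rough sketch of how that could be wrapped up for reuse across several plots; std_error is a hypothetical helper name of my own, not a function from dplyr or base R, and emsa_1 is rebuilt from the question's data.

```r
library(dplyr)

# Data from the question, rebuilt so the example is self-contained
emsa_1 <- data.frame(
  oligo = rep(c("REF", "ALT"), each = 6),
  condition = rep(rep(c("Sample", "Competitor"), each = 3), times = 2),
  score = c(27.827, 24.622, 31.042, 21.066, 18.413, 36.164,
            75.465, 57.058, 66.408, 35.420, 17.652, 21.466)
)

# Hypothetical helper: standard error of the mean, sd / sqrt(n)
std_error <- function(x) sd(x) / sqrt(length(x))

# Mean and standard error per oligo/condition
emsa_se <- emsa_1 %>%
  group_by(oligo, condition) %>%
  summarise(mean = mean(score), se = std_error(score), .groups = "drop")

print(emsa_se)
```

The mean and se columns can then be passed to geom_errorbar as ymin = mean - se and ymax = mean + se.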
