I'm trying to write a ggplot for-loop, without success. I want to make one scatter plot per aminoacid (so essentially 22 different scatter plots, each containing only the values for that aminoacid). Instead, every value is being plotted in every output plot.

The data file looks like this:

dput(head(df_melt_differentials))

structure(list(codon = c("AAA", "AAC", "AAG", "AAT", "ACA", "ACC"
), Fed_differential_cutoff0.5 = c(0.405320284943889, 0.538603465353382, 
0.594679715056111, 0.461396534646618, 0.279723500180007, 0.350047954876902
), Fed_differential_cutoff0 = c(0.400929382467845, 0.541230665098641, 
0.599070617532155, 0.458769334901359, 0.281177150483858, 0.351472083384939
), Fed_differential_cutoff1 = c(0.389691692491739, 0.572371186663778, 
0.610308307508261, 0.427628813336222, 0.258694141571916, 0.371346938275356
), Fed_differential_cutoff2 = c(0.376102000883263, 0.543866386823925, 
0.623897999116737, 0.456133613176075, 0.240118371752021, 0.371624132164088
), Starved_differential_cutoff0.5 = c(0.35341548435504, 0.612764761460883, 
0.64658451564496, 0.387235238539117, 0.241749339598093, 0.401216490580919
), Starved_differential_cutoff0 = c(0.351704818898789, 0.613092767267543, 
0.648295181101211, 0.386907232732457, 0.242028282002779, 0.398227680007641
), Starved_differential_cutoff1 = c(0.351258676092076, 0.616216524001233, 
0.648741323907924, 0.383783475998767, 0.236979413320061, 0.417121137360074
), Starved_differential_cutoff2 = c(0.330195165073707, 0.631859350667716, 
0.669804834926293, 0.368140649332284, 0.226783649173637, 0.440433256347991
), AA = c("K", "N", "K", "N", "T", "T"), full_amino = c("Lysine", 
"Asparagine", "Lysine", "Asparagine", "Threonine", "Threonine"
), aminoacid = c("Lys", "Asn", "Lys", "Asn", "Thr", "Thr"), wobble = c("AT_wobble", 
"GC_wobble", "GC_wobble", "AT_wobble", "AT_wobble", "GC_wobble"
), wobble_single = c("A_wobble", "C_wobble", "G_wobble", "T_wobble", 
"A_wobble", "C_wobble")), row.names = c(NA, 6L), class = "data.frame")

My loop is:

for (aminoacid in df_melt_differentials$aminoacid) {
  
  cutoff0_gingold_loop <- ggplot(df_melt_differentials, aes(x=Fed_differential_cutoff0, y= Starved_differential_cutoff0)) +
    geom_point(aes(color = wobble)) +
    theme_bw(base_size = 16)+
    labs(title = paste(aminoacid, "RSCU of Differential Genes (Log2FC cutoff = 0)")) +
    geom_abline(slope = 1, intercept = 0, linetype= "dashed")
  
  cutoff0_gingold_loop +
    geom_label_repel(aes(label = codon),
                     box.padding   = 0.35, 
                     point.padding = 0.5,
                     segment.color = 'grey50') +
    theme_classic()
  
    ggsave(filename = paste(aminoacid, "RSCU_FvS_differential_cutoff0_gingold.png", sep = "_"), bg = "white", width = 7, height = 7, dpi = 600)
}

I know it's probably a silly mistake but I can't seem to figure out where I've gone wrong.

I also have a secondary question, though I'm not too bothered if it isn't answered. In the end, I normally have 4 different scatter plots, one for each of my 4 cutoffs (0, 0.5, 1 and 2). Is there a way to incorporate this into the loop? Ideally, I'd like to have Fed_differential_cutoff0 vs Starved_differential_cutoff0 (for each individual aminoacid), and the same for cutoff0.5/cutoff1/cutoff2.

Thanks in advance!

Comments

  • Consider adding dput(head(df_melt_differentials)) to your reprex to share the exact, machine-readable data, so people don't have to parse a text table to reproduce your plots. Commented Jun 22, 2020 at 15:13
  • As to your secondary question: yes, you could do a nested loop, or you could convert your data to a long format (see this FAQ) and use facets. That would probably be nicer, giving you all four cutoffs as subplots. Commented Jun 22, 2020 at 15:54

1 Answer

You don't subset the data anywhere, so every iteration plots the full data frame. I would rewrite as:

library(ggplot2)
library(ggrepel)  # provides geom_label_repel()

for (this_aminoacid in unique(df_melt_differentials$aminoacid)) {
  
  # Build the base scatter plot from the rows for this aminoacid only
  cutoff0_gingold_loop <- ggplot(
    data = subset(df_melt_differentials, aminoacid == this_aminoacid),
    aes(x = Fed_differential_cutoff0, y = Starved_differential_cutoff0)
  ) +
    geom_point(aes(color = wobble)) +
    theme_bw(base_size = 16) +
    labs(title = paste(this_aminoacid, "RSCU of Differential Genes (Log2FC cutoff = 0)")) +
    geom_abline(slope = 1, intercept = 0, linetype = "dashed")
  
  # Keep the labelled plot in a variable so it can be handed to ggsave() explicitly
  cutoff0_gingold_labelled <- cutoff0_gingold_loop +
    geom_label_repel(aes(label = codon),
                     box.padding   = 0.35, 
                     point.padding = 0.5,
                     segment.color = 'grey50') +
    theme_classic()
  
  # Passing plot = ... avoids relying on ggsave()'s default of last_plot()
  ggsave(filename = paste(this_aminoacid, "RSCU_FvS_differential_cutoff0_gingold.png", sep = "_"),
         plot = cutoff0_gingold_labelled, bg = "white", width = 7, height = 7, dpi = 600)
}

I have

  • added subset to tell R which data to use each time
  • changed the name of the looping variable to this_aminoacid for clarity
  • looped over unique(df_melt_differentials$aminoacid) so each value is only used once instead of however many times it shows up in your data
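
For the secondary question, one possible approach is the nested loop mentioned in the comments. The sketch below is only an illustration under some assumptions: the toy data frame df is a stand-in for df_melt_differentials with rounded values, the helper plot_one() and the list plots are hypothetical names, and the columns are assumed to follow the Fed_differential_cutoff<value> / Starved_differential_cutoff<value> naming from the question. The column names are built as strings and looked up through ggplot2's .data pronoun. Building the plot inside a function gives each plot its own copy of the column names, so plots stored in a list don't all end up pointing at the last cutoff.

```r
library(ggplot2)

# Toy stand-in for df_melt_differentials (assumption: rounded, only two cutoffs)
df <- data.frame(
  codon     = c("AAA", "AAC", "AAG", "AAT"),
  aminoacid = c("Lys", "Asn", "Lys", "Asn"),
  wobble    = c("AT_wobble", "GC_wobble", "GC_wobble", "AT_wobble"),
  Fed_differential_cutoff0       = c(0.40, 0.54, 0.60, 0.46),
  Starved_differential_cutoff0   = c(0.35, 0.61, 0.65, 0.39),
  Fed_differential_cutoff0.5     = c(0.41, 0.54, 0.59, 0.46),
  Starved_differential_cutoff0.5 = c(0.35, 0.61, 0.65, 0.39)
)

# Hypothetical helper: one scatter plot for one aminoacid and one cutoff
plot_one <- function(data, aa, cutoff) {
  x_col <- paste0("Fed_differential_cutoff", cutoff)
  y_col <- paste0("Starved_differential_cutoff", cutoff)

  ggplot(
    data = data[data$aminoacid == aa, ],
    aes(x = .data[[x_col]], y = .data[[y_col]])
  ) +
    geom_point(aes(color = wobble)) +
    geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
    labs(
      title = paste0(aa, " RSCU of Differential Genes (Log2FC cutoff = ", cutoff, ")"),
      x = x_col,
      y = y_col
    ) +
    theme_classic()
}

cutoffs <- c("0", "0.5")  # use c("0", "0.5", "1", "2") with the full data
plots <- list()

# Outer loop over aminoacids, inner loop over cutoffs
for (this_aminoacid in unique(df$aminoacid)) {
  for (this_cutoff in cutoffs) {
    plot_name <- paste(this_aminoacid, this_cutoff, sep = "_cutoff")
    plots[[plot_name]] <- plot_one(df, this_aminoacid, this_cutoff)
  }
}
```

Each entry of plots can then be written out with ggsave(filename = ..., plot = plots[[plot_name]], ...), and the geom_label_repel() layer from the answer can be added inside plot_one() in the same way. Reshaping to long format and faceting, as suggested in the comments, would be an alternative that puts all cutoffs for one aminoacid in a single figure.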