I want to produce Q-Q plots with reference lines for chem data that arrives as an Excel file with an unknown number of variables and observations, then store each plot as an object (e.g. qqplot1, qqplot2, qqplot3, ...) for later inclusion in a summary report. I'm writing a generic script to run on data sets as they come to me, so the number and names of the variables will vary.

  1. The first bit of script makes a data frame (df), which is pretty much how the data looks after import from Excel. This one has three variables (As, Ba, Cu), and the number of observations varies between variables.
As = c(10, 20, 10, 12, 7, 14, 6, 9, 11, 15)
Ba = c(110, 120, 210, 112, 97, 214, 116, 211, 115, NA)
Cu = c(1, 1, 2, 11, 9, 21, 16, 19, NA, NA )
df = data.frame(As, Ba, Cu)

I can facet-wrap all the variables by first pivoting to long format and then plotting (see code below). The pivot gives the new columns generic names (name and value). This is OK with only three variables, but not so good with 20 or up to 50 variables.

Ideally, I would like to save each of the plots as objects that are sequentially numbered for later inclusion in summary HTML or PDF report.

Any ideas are welcome. RM

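library(ggplot2)
library(tidyr)    # pivot_longer()
library(qqplotr)  # stat_qq_band(), stat_qq_line(), stat_qq_point()

df_l = pivot_longer(df, cols = everything())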

qqplot <- ggplot(data = df_l, mapping = aes(sample = value)) +
  stat_qq_band(alpha=0.5) +
  stat_qq_line() +
  stat_qq_point() +
  facet_wrap(~ name, scales = "free") +
  labs(x = "Theoretical Quantiles", y = "Sample Quantiles")
qqplot

PS: I was sort of going down this route, but I want it in ggplot and have no idea how to save the plots with sequential numbers.

par(mfrow = c(1, 1))
# Draw a base-R normal Q-Q plot for each column in turn
for (i in seq_along(df)) {
  qqnorm(df[[i]], main = names(df)[i])
  qqline(df[[i]])
}

1 Answer

Maybe this is what you are looking for. Using e.g. purrr::imap (or lapply or ...) this could be achieved like so:

  1. Put your code for the qqplot inside a function

  2. Split your long df by name

  3. Use purrr::imap to loop over the split df

    • imap has the advantage of passing the name of each split (i.e. the variable name) to the function, which makes it easy to add a title to the plot.
    • A second option for titling the plots is to keep the facet_wrap, which gives each plot a facet-style strip title.

As a result you get a named list of qqplots:

As = c(10, 20, 10, 12, 7, 14, 6, 9, 11, 15)
Ba = c(110, 120, 210, 112, 97, 214, 116, 211, 115, NA)
Cu = c(1, 1, 2, 11, 9, 21, 16, 19, NA, NA )
df = data.frame(As, Ba, Cu)

library(ggplot2)
library(tidyr)
library(purrr)
library(qqplotr)

df_l = pivot_longer(df, cols = everything())

my_qqplot <- function(.data, .title) {
  ggplot(data = .data, mapping = aes(sample = value)) +
    stat_qq_band(alpha=0.5) +
    stat_qq_line() +
    stat_qq_point() +
    facet_wrap(~ name, scales = "free") +
    labs(x = "Theoretical Quantiles", y = "Sample Quantiles", title = .title)
}

qqplots <- df_l %>% 
  split(.$name) %>% 
  imap(my_qqplot)

qqplots$As # or qqplots[[1]]
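
If sequentially numbered names (qqplot1, qqplot2, ...) are preferred over variable names, the same idea can be sketched with base R's lapply, as hinted at above. This is only a sketch under a few assumptions: qq_for_column is a made-up helper name (not part of any package), it loops over the columns of the wide df directly so no pivot is needed, and it drops NA values before plotting.

```r
library(ggplot2)
library(qqplotr)

# Example data from the question
df <- data.frame(
  As = c(10, 20, 10, 12, 7, 14, 6, 9, 11, 15),
  Ba = c(110, 120, 210, 112, 97, 214, 116, 211, 115, NA),
  Cu = c(1, 1, 2, 11, 9, 21, 16, 19, NA, NA)
)

# Hypothetical helper: build one Q-Q plot for a single column of `data`
qq_for_column <- function(col, data) {
  # Keep only the non-missing observations of this column
  d <- data.frame(value = data[[col]][!is.na(data[[col]])])
  ggplot(d, aes(sample = value)) +
    stat_qq_band(alpha = 0.5) +
    stat_qq_line() +
    stat_qq_point() +
    labs(x = "Theoretical Quantiles", y = "Sample Quantiles", title = col)
}

# One plot per column, however many columns the data set has
qqplots <- lapply(names(df), qq_for_column, data = df)

# Sequential names: qqplot1, qqplot2, ...
names(qqplots) <- paste0("qqplot", seq_along(qqplots))
```

A single plot is then available as qqplots$qqplot2 (or qqplots[[2]]). In an R Markdown chunk, for (p in qqplots) print(p) should render all of them; the explicit print() is needed because ggplot objects are not auto-printed inside a loop.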

1 Comment

Brilliant. Have not used purrr before. Thanks.
