I am trying to make a composite plot in R using the packages ggplot2 and ggpubr. Each individual plot has a normal distribution curve specific to its own dataset, and each plot renders correctly on its own. However, when I generate the composite plot, both plots show the same curve: that of the last dataset.

How can I generate the composite plot with each plot having its own specific normal distribution curve?

CODE AND OUTPUT PLOTS

## PLOT 1 ##

library(ggplot2)

results_matrix_C <- data.frame(matrix(rnorm(20), nrow=20))
colnames(results_matrix_C) <- c("X")

m <- mean(results_matrix_C$X)
sd <- sd(results_matrix_C$X)
dnorm_C <- function(x){
  norm_C <- dnorm(x, m, sd)
  return(norm_C)
}

e = 1
dnorm_one_sd_C <- function(x){
  norm_one_sd_C <- dnorm(x, m, sd)
  # Have NA values outside interval x in [e]:
  norm_one_sd_C[x <= e] <- NA
  return(norm_one_sd_C)
}


C <- ggplot(results_matrix_C, aes(x = results_matrix_C$X)) +
  geom_histogram(aes(y=..density..), bins = 10, colour = "black", fill = "white") +
  stat_function(fun = dnorm_one_sd_C, geom = "area", fill = "#CE9A05", color = "#CE9A05", alpha = 0.25, size = 1) +
  stat_function(fun = dnorm_C, colour = "#CE0539", size = 1) +
  theme_classic()

## PLOT 2 ##

results_matrix_U <- data.frame(matrix(rnorm(20)+1, nrow=20))
colnames(results_matrix_U) <- c("X")

m <- mean(results_matrix_U$X)
sd <- sd(results_matrix_U$X)
dnorm_U <- function(x){
  norm_U <- dnorm(x, m, sd)
  return(norm_U)
}

e = 2
dnorm_one_sd_U <- function(x){
  norm_one_sd_U <- dnorm(x, m, sd)
  # Have NA values outside interval x in [e]:
  norm_one_sd_U[x <= e] <- NA
  return(norm_one_sd_U)
}


U <- ggplot(results_matrix_U, aes(x = results_matrix_U$X)) +
  geom_histogram(aes(y=..density..), bins = 10, colour = "black", fill = "white") +
  stat_function(fun = dnorm_one_sd_U, geom = "area", fill = "#CE9A05", color = "#CE9A05", alpha = 0.25, size = 1) +
  stat_function(fun = dnorm_U, colour = "#CE0539", size = 1) +
  theme_classic()

library(ggpubr)

ggarrange(C, U,
          nrow = 1, ncol = 2)

In the composite plot, the first panel shows the normal distribution curve of the second plot rather than its own curve from Plot 1.

UPDATE

The variables used above are:

  • e: the cutoff for the shaded area under the distribution curve (the area is drawn for x > e)
  • m: the mean of the dataset
  • sd: the standard deviation of the dataset

m and sd are used to generate the normal distribution curves.

  • I think the issue has nothing to do with plots, and everything to do with the way you define your functions. dnorm_C takes only x as an argument, but it also uses m and sd. You may need to force them, but better practice would be to pass them in explicitly - skimming your question I'm not really sure what values you want to use (and clearly R isn't sure either). Good reading on this topic is the functional operators section of Advanced R. The dnorm_one_sd_C is even worse, it uses a constant e that I don't see defined anywhere. Commented Jun 27, 2018 at 14:38
  • @Gregor, thank you for the response. I have updated the issue you had with dnorm_one_sd_C. If the problem is with the function, could you please provide a worked example so that I can fix the problem I have. Commented Jun 27, 2018 at 15:04
  • The issue is not that "you don't explain what e is", the issue is that your functions use variables like e, m and sd that are not passed in as arguments. My advice is that you should rewrite your dnorm_one_sd_C <- function(x) as dnorm_one_sd_C <- function(x, m, sd, e) and rewrite dnorm_one_sd_U <- function(x) as dnorm_one_sd_U <- function(x, m, sd, e). If you want an explanation of why this is a problem and why this advice is needed, read the link I posted in my first comment; it is too complicated to explain well here. Commented Jun 27, 2018 at 15:09
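
The point made in these comments can be reproduced without any plotting. The sketch below uses base R only; the names curve_first, m, s, before and after are illustrative and not taken from the question. It shows that a function which reads m and s from the global environment picks up whatever values they hold when the function is called, not when it was defined. As far as I understand, stat_function() calls its function when the plot is drawn, which would explain why both panels end up using the values from the second dataset.

```r
# Parameters for the "first dataset" (illustrative values)
m <- 0
s <- 1

# m and s are free variables: they are looked up when the function is called
curve_first <- function(x) dnorm(x, m, s)

before <- curve_first(0)   # evaluated with m = 0, s = 1

# The "second dataset" overwrites the same global names
m <- 5
s <- 2

after <- curve_first(0)    # same function, now evaluated with m = 5, s = 2

# Both comparisons are TRUE: the function followed the reassignment
print(identical(before, dnorm(0, 0, 1)))
print(identical(after, dnorm(0, 5, 2)))
```

Printing plot C before the second dataset is processed looks correct for the same reason: at that moment the global values still belong to the first dataset.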

1 Answer

SOLVED

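Defining each function in full inside the stat_function() call of the ggplot2 code, and passing the mean, standard deviation and e to it through args, has worked. The values are now handed over when each plot is created instead of being looked up later: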

## PLOT 1 ##

library(ggplot2)

results_matrix_C <- data.frame(matrix(rnorm(20), nrow=20))
colnames(results_matrix_C) <- c("X")

mean <- mean(results_matrix_C$X)
sd <- sd(results_matrix_C$X)
e = 1


C <- ggplot(results_matrix_C, aes(x = X)) +
  geom_histogram(aes(y = ..density..), bins = 10, colour = "black", fill = "white") +
  # Shaded area under the curve for x > e; values are supplied through args
  stat_function(
    fun = function(x, mean, sd, e) {
      norm_one_sd_C <- dnorm(x, mean, sd)
      norm_one_sd_C[x <= e] <- NA
      return(norm_one_sd_C)
    },
    args = list(mean = mean, sd = sd, e = e),
    geom = "area", fill = "#CE9A05", color = "#CE9A05", alpha = 0.25, size = 1) +
  # Full normal density curve for this dataset
  stat_function(
    fun = function(x, mean, sd) {
      dnorm(x = x, mean = mean, sd = sd)
    },
    args = list(mean = mean, sd = sd), colour = "#CE0539", size = 1) +
  theme_classic()

## PLOT 2 ##

results_matrix_U <- data.frame(matrix(rnorm(20)+1, nrow=20))
colnames(results_matrix_U) <- c("X")

mean <- mean(results_matrix_U$X)
sd <- sd(results_matrix_U$X)
e = 2


U <- ggplot(results_matrix_U, aes(x = X)) +
  geom_histogram(aes(y = ..density..), bins = 10, colour = "black", fill = "white") +
  # Shaded area under the curve for x > e; values are supplied through args
  stat_function(
    fun = function(x, mean, sd, e) {
      norm_one_sd_U <- dnorm(x, mean, sd)
      norm_one_sd_U[x <= e] <- NA
      return(norm_one_sd_U)
    },
    args = list(mean = mean, sd = sd, e = e),
    geom = "area", fill = "#CE9A05", color = "#CE9A05", alpha = 0.25, size = 1) +
  # Full normal density curve for this dataset
  stat_function(
    fun = function(x, mean, sd) {
      dnorm(x = x, mean = mean, sd = sd)
    },
    args = list(mean = mean, sd = sd), colour = "#CE0539", size = 1) +
  theme_classic()

library(ggpubr)

ggarrange(C, U,
          nrow = 1, ncol = 2)
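
An alternative to writing the functions inline is a small function factory, which may be closer to what the comments on the question were pointing at. This is only a sketch: make_shaded_dnorm, x_C, x_U, shade_C and shade_U are hypothetical names that do not appear in the original code. Each call fixes the mean, standard deviation and cutoff at creation time, so reusing variable names for the second dataset cannot affect the first curve.

```r
# Hypothetical helper (not from the original post): returns a density
# function whose parameters are fixed when the helper is called.
make_shaded_dnorm <- function(m, s, e) {
  # Evaluate the arguments now, so later reassignment has no effect
  force(m); force(s); force(e)
  function(x) {
    d <- dnorm(x, mean = m, sd = s)
    d[x <= e] <- NA   # keep only the region x > e for shading
    d
  }
}

set.seed(1)              # only to make the sketch reproducible
x_C <- rnorm(20)         # stands in for results_matrix_C$X
x_U <- rnorm(20) + 1     # stands in for results_matrix_U$X

# One function per dataset, each carrying its own parameters
shade_C <- make_shaded_dnorm(mean(x_C), sd(x_C), e = 1)
shade_U <- make_shaded_dnorm(mean(x_U), sd(x_U), e = 2)
```

Each returned function takes only x, so it could be passed as fun = shade_C or fun = shade_U to stat_function() without needing args.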

