I have a histogram which I want to facet based on three separate factors. I then want to add two lines of text in the top right-hand corner of each facet. The text is data dependent, and will be different for every subset of the data.
Here I use the Heart Attack Analysis data from Kaggle. I download and unzip the data, then read in heart.csv. I group the data by three factors (sex, slp, exng) and get the maximum and minimum age within each subset. Then I plot histograms of age, faceted by those factors, in ggplot. I want the maximum and minimum age in the top right-hand corner of each facet, but I have only figured out how to do this for a single plot (without the facet grid step).
Here's the code:
# Load packages and data
library(dplyr)
library(ggplot2)
heart <- read.csv(file = 'C:/FilePath/heart.csv')
# Split data into subsets based on our three factors
hrt_grps <- heart %>%
  group_split(sex, slp, exng)
# Get the max and min within each subset (and some other stats as well)
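# Pre-allocate one row per subset so the column names can be assigned
# (naming the columns of an empty, zero-column tibble raises an error)
hrt_grps_smry <- as.data.frame(matrix(NA_real_, nrow = length(hrt_grps), ncol = 9))
colnames(hrt_grps_smry) <- c("sex", "slp", "exng", "max_d", "min_d",
                             "mean_d", "t.p-val", "t.conf.L", "t.conf.U")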
# Iterate through every element in the hrt_grps group-split and populate the rows of hrt_grps_smry df
for (i in seq_along(hrt_grps)) {
  t.tst <- t.test(x = hrt_grps[[i]]$age,
                  alternative = "two.sided")
  hrt_grps_smry[i, "sex"] <- hrt_grps[[i]]$sex[1]
  hrt_grps_smry[i, "slp"] <- hrt_grps[[i]]$slp[1]
  hrt_grps_smry[i, "exng"] <- hrt_grps[[i]]$exng[1]
  hrt_grps_smry[i, "max_d"] <- max(hrt_grps[[i]]$age)
  hrt_grps_smry[i, "min_d"] <- min(hrt_grps[[i]]$age)
  hrt_grps_smry[i, "mean_d"] <- mean(hrt_grps[[i]]$age)
  hrt_grps_smry[i, "t.p-val"] <- t.tst$p.value
  hrt_grps_smry[i, "t.conf.L"] <- t.tst$conf.int[[1]]
  hrt_grps_smry[i, "t.conf.U"] <- t.tst$conf.int[[2]]
}
# Plot single histogram with max and min in top right-hand corner (successful):
heart %>%
  # This line is because in my real data it is very important that I control the order of the facets
  # (written without across(), whose extra-argument form is deprecated in recent dplyr)
  mutate(slp = factor(slp, levels = c(2, 0, 1))) %>%
  ggplot(aes(x = age)) +
  geom_histogram(bins = 35) +
  # facet_grid(sex ~ slp ~ exng) +
  geom_text(
    data = hrt_grps_smry,
    aes(x = 5, y = median(density(heart$age)$y)),
    label = max(hrt_grps_smry$max_d), vjust = -35, hjust = -40,
    size = 4, angle = 0, colour = "gray10") +
  geom_text(
    data = hrt_grps_smry,
    aes(x = 5, y = median(density(heart$age)$y)),
    label = min(hrt_grps_smry$min_d), vjust = -32, hjust = -40,
    size = 4, angle = 0, colour = "gray10") +
  ylab("Count")
# Plot facet-grid histogram of ages with the max and min in the top right-hand corner
heart %>%
  mutate(slp = factor(slp, levels = c(2, 0, 1))) %>%
  ggplot(aes(x = age)) +
  geom_histogram(bins = 35) +
  facet_grid(sex ~ slp ~ exng) +
  # Only the first subset's max is used here, so every facet shows the same value
  geom_text(
    data = hrt_grps_smry,
    aes(x = 5, y = median(density(heart$age)$y)),
    label = hrt_grps_smry$max_d[1], vjust = -4.1, hjust = -18,
    size = 4, angle = 0, colour = "gray10") +
  # Likewise, only the first subset's min is used here
  geom_text(
    data = hrt_grps_smry,
    aes(x = 5, y = median(density(heart$age)$y)),
    label = hrt_grps_smry$min_d[1], vjust = -2.8, hjust = -18,
    size = 4, angle = 0, colour = "gray10") +
  ylab("Count")
I've only figured out how to grab the max and min values for the first subset. I have not figured out how to iterate through subsets and keep all plots in the same facet grid ggplot object.
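For reference, here is a minimal sketch of the kind of result I am after. It is not a confirmed solution, and it rests on several assumptions: `heart_demo` is a made-up stand-in for heart.csv (only the column names match), `age_labels` and `label` are hypothetical names, and the facets are laid out with `facet_grid(sex + exng ~ slp)` rather than the three-part formula above. The idea is to map the label inside `aes()` from a summary data frame that contains the faceting columns, so that ggplot can route each row to its own panel, and to use `Inf` for x and y so the text sits in the top right-hand corner whatever the axis ranges are.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for heart.csv so the sketch runs on its own;
# only the column names (age, sex, slp, exng) match the real data.
set.seed(1)
heart_demo <- data.frame(
  age  = sample(29:77, 300, replace = TRUE),
  sex  = sample(0:1, 300, replace = TRUE),
  slp  = sample(0:2, 300, replace = TRUE),
  exng = sample(0:1, 300, replace = TRUE)
) %>%
  # Fix the facet order once, before summarising, so both data frames agree
  mutate(slp = factor(slp, levels = c(2, 0, 1)))

# One row per sex/slp/exng combination, holding that panel's two-line label
age_labels <- heart_demo %>%
  group_by(sex, slp, exng) %>%
  summarise(max_d = max(age), min_d = min(age), .groups = "drop") %>%
  mutate(label = paste0("max: ", max_d, "\nmin: ", min_d))

p <- ggplot(heart_demo, aes(x = age)) +
  geom_histogram(bins = 35) +
  facet_grid(sex + exng ~ slp, labeller = label_both) +
  # x = Inf, y = Inf pins the text to the top-right corner of each panel;
  # hjust/vjust slightly above 1 pull it back inside the panel border
  geom_text(
    data = age_labels,
    aes(x = Inf, y = Inf, label = label),
    hjust = 1.1, vjust = 1.2,
    size = 4, colour = "gray10",
    inherit.aes = FALSE
  ) +
  ylab("Count")

print(p)
```

Because the label comes from a column of `age_labels` rather than from a single value such as `hrt_grps_smry$max_d[1]`, each facet should receive the text from its own row, with no loop over the subsets.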
