I have a histogram which I want to facet based on three separate factors. I then want to add two lines of text in the top right-hand corner of each facet. The text is data dependent, and will be different for every subset of the data.

Here I use the Heart Attack Analysis data from Kaggle. I download and unzip the data, then read in heart.csv. I split the data by three factors (sex, slp, exng) and get the maximum and minimum age within each subset. Then I plot the ages by factor with ggplot. I want the maximum and minimum ages shown in the top right-hand corner of each facet, but I have only figured out how to do this for a single plot (without the facet grid step).

Here's the code:

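# Load packages and data
library(dplyr)    # %>%, group_split(), mutate()
library(ggplot2)  # ggplot(), geom_histogram(), geom_text()

heart <- read.csv(file = 'C:/FilePath/heart.csv')
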
# Split data into subsets based on our three factors
hrt_grps <- heart %>%
  group_split(sex,slp,exng)

# Get the max and min within each subset (and some other stats as well).
# Build one summary row per subset and bind the rows together, rather than
# naming the columns of an empty tibble and filling it cell by cell.
hrt_grps_smry <- dplyr::bind_rows(lapply(hrt_grps, function(grp) {
  # One-sample, two-sided t-test on the ages in this subset
  t.tst <- t.test(x = grp$age,
                  alternative = "two.sided")

  tibble::tibble(
    sex       = grp$sex[1],
    slp       = grp$slp[1],
    exng      = grp$exng[1],
    max_d     = max(grp$age),
    min_d     = min(grp$age),
    mean_d    = mean(grp$age),
    `t.p-val` = t.tst$p.value,
    t.conf.L  = t.tst$conf.int[[1]],
    t.conf.U  = t.tst$conf.int[[2]]
  )
}))

# Plot single histogram with max and min in top right-hand corner (successful):
heart %>%
  # In my real data it is very important that I control the order of the facets
  mutate(slp = factor(slp, levels = c(2, 0, 1))) %>%
  ggplot(aes(x=age)) +
  geom_histogram(bins = 35) +
  # facet_grid(sex ~ slp ~ exng) +
  geom_text(
    data = hrt_grps_smry, 
    aes(x=5, y = median(density(heart$age)$y)), 
    label = max(hrt_grps_smry$max_d), vjust = -35, hjust = -40, 
    size = 4, angle = 0, colour = "gray10") +
  geom_text(
    data = hrt_grps_smry, 
    aes(x=5, y = median(density(heart$age)$y)), 
    label = min(hrt_grps_smry$min_d), vjust = -32, hjust = -40, 
    size = 4, angle = 0, colour = "gray10") +
  ylab("Count")

# Plot facet-grid histogram of ages with the max and min in the top right-hand corner
heart %>%
  mutate(slp = factor(slp, levels = c(2, 0, 1))) %>%
  ggplot(aes(x=age)) +
  geom_histogram(bins = 35) +
  facet_grid(sex ~ slp ~ exng) +
  geom_text(
    data = hrt_grps_smry, 
    aes(x=5, y = median(density(heart$age)$y)), 
    label = hrt_grps_smry$max_d[1], vjust = -4.1, hjust = -18, 
    size = 4, angle = 0, colour = "gray10") +
  geom_text(
    data = hrt_grps_smry, 
    aes(x=5, y = median(density(heart$age)$y)), 
    label = hrt_grps_smry$min_d[1], vjust = -2.8, hjust = -18, 
    size = 4, angle = 0, colour = "gray10") +
  ylab("Count")

I've only figured out how to grab the max and min values for the first subset. I have not figured out how to iterate through subsets and keep all plots in the same facet grid ggplot object.

1 Answer

The simplest way of doing this, as far as I know, is to add the min_d and max_d variables to the original dataset and then use them in geom_text:

library(tidyverse)

# Load data
heart <- read.csv(file = 'test/heart.csv')

# calculate group wise max and min age and also create a group id variable 
# which will be used later to do merging
min_max_df <- heart %>% 
  group_by(sex, slp, exng) %>% 
  summarise(
    id = cur_group_id(),
    max_d = max(age),
    min_d = min(age),
    .groups = "drop"
  )

# merge the group wise min and max age with the main data by group id 
heart <- heart %>% 
  group_by(sex, slp, exng) %>% 
  mutate(
    id = cur_group_id()
  ) %>% 
  ungroup() %>% 
  left_join(
    min_max_df %>% select(id, max_d, min_d),
    by = "id"
  )


heart %>%
  # Set the facet order for slp
  mutate(slp = factor(slp, levels = c(2, 0, 1))) %>%
  ggplot(aes(x=age)) +
  geom_histogram(bins = 35) +
  geom_text(aes(x = 72, y = 8, label = paste0("Max age: ", max_d)),
            size = 3, color = colorspace::lighten("black", amount = 0.5)) +
  geom_text(aes(x = 72, y = 6, label = paste0("Min age: ", min_d)),
            size = 3, color = colorspace::lighten("black", amount = 0.5)) +
  facet_grid(sex ~ slp ~ exng) 
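
Because geom_text() in the code above inherits the full dataset, each label is drawn once per row of its facet, which can make the text look heavy. A possible variation, sketched below under some assumptions, is to pass a one-row-per-facet summary to geom_text() through its data argument and to pin the text to the corner with x = Inf, y = Inf. The data frame toy is a synthetic stand-in for heart.csv (same column names, invented values), facet_labels is a name made up for this sketch, and the layout sex ~ slp + exng is an assumption; the three-part formula used above can be substituted.

```r
library(dplyr)
library(ggplot2)

# Synthetic stand-in for heart.csv: same column names, invented values
set.seed(1)
toy <- tibble(
  sex  = sample(0:1, 300, replace = TRUE),
  slp  = sample(0:2, 300, replace = TRUE),
  exng = sample(0:1, 300, replace = TRUE),
  age  = sample(29:77, 300, replace = TRUE)
) %>%
  # Fix the facet order before summarising so both data frames share the levels
  mutate(slp = factor(slp, levels = c(2, 0, 1)))

# One row per facet, holding the two-line label for that subset
facet_labels <- toy %>%
  group_by(sex, slp, exng) %>%
  summarise(
    label = paste0("Max age: ", max(age), "\nMin age: ", min(age)),
    .groups = "drop"
  )

p <- ggplot(toy, aes(x = age)) +
  geom_histogram(bins = 35) +
  # x = Inf, y = Inf places the text at the top-right corner of every panel;
  # hjust and vjust slightly above 1 pull it back inside the panel border
  geom_text(
    data = facet_labels,
    aes(x = Inf, y = Inf, label = label),
    hjust = 1.1, vjust = 1.2, size = 3, colour = "gray40",
    inherit.aes = FALSE
  ) +
  facet_grid(sex ~ slp + exng) +
  ylab("Count")

p
```

Since the label position no longer depends on hard-coded x and y values, it should stay in the corner if the bin count or the age range changes. The facetting columns in facet_labels need to match those in the plotted data, which is why slp is converted to a factor before summarising.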
