I am creating a grouped boxplot in plotly from predefined quantiles, and I want to color the outline of each box based on a separate variable. I can't seem to do this directly in my plot_ly call. There is a nice solution here which changes the line colours as a post-processing step using plotly_build. The example in that link works well, but the structure of the built data is different when using predefined quantiles, and I can't access the traces the way that example does. Perhaps there is a way, but I can't figure it out.

My attempted solution is to add a second box trace with a transparent fill:

library(plotly)
library(dplyr)

# CREATE DUMMY DATA
set.seed(123) # Set seed for reproducibility

site_name <- rep(paste0("site_", 1:5), each = 40) # Create the site_name column with 5 different site names, each with 40 rows
site_type <- rep(c("A", "B"), each = 20, times = 5) # Create the site_type column with 20 'A's and 20 'B's for each site
value <- runif(200, min = 0, max = 200) # Create the value column with one random number (between 0 and 200) per row
# Combine into a data frame
df <- data.frame(site_name, site_type, value)

# Generate site_status randomly for each combination of site_name and site_type
unique_combinations <- unique(df[c("site_name", "site_type")])
unique_combinations$site_status <- sample(c("Good", "Bad"), nrow(unique_combinations), replace = TRUE)
# Merge site_status back to the original df
df <- df %>%
  left_join(unique_combinations, by = c("site_name", "site_type"))

# Display the first few rows of the dataset
head(df, 20)

# MAKE SUMMARY DATA

# Group by site_name and site_type, then calculate summary statistics
stats_df <- df %>%
  group_by(site_name, site_type) %>%
  summarise(
    lower_fence = quantile(value, probs = c(0.05), type = 5, na.rm = TRUE),
    q1 = quantile(value, probs = c(0.25), type = 5, na.rm = TRUE),
    median = quantile(value, probs = c(0.5), type = 5, na.rm = TRUE),
    mean = mean(value, na.rm = TRUE),
    q3 = quantile(value, probs = c(0.75), type = 5, na.rm = TRUE),
    upper_fence = quantile(value, probs = c(0.95), type = 5, na.rm = TRUE),
    sd = sd(value, na.rm = TRUE),
    site_status = unique(site_status),
    .groups = 'drop'
  )

# PLOTTING CODE

# make box plot
fig <- plot_ly(data = stats_df, 
               x = ~site_name, 
               color = ~site_type,   # boxes
               colors = c("blue","red"), 
               type = "box",
               lowerfence = ~lower_fence, 
               q1 = ~q1, 
               median = ~median,
               q3 = ~q3, 
               upperfence = ~upper_fence) %>%
  layout(boxmode = "group", boxgap = 1/5)

# Select the boxes that should get a green outline
bad_data <- stats_df %>% filter(site_status == "Bad")

# Add green boxes
fig <- fig %>% plotly::add_trace(
  x = factor(bad_data$site_name),
  color = factor(bad_data$site_type),
  colors = c("blue","red"),
  type = "box",
  lowerfence = bad_data$lower_fence,
  q1 = bad_data$q1,
  median = bad_data$median,
  q3 = bad_data$q3,
  upperfence = bad_data$upper_fence,
  line = list(color = "green"),
  fillcolor = "rgba(255,0,0,0.0)", # Alpha of 0, i.e. fully transparent fill
  boxmean = FALSE, # Avoid adding box means
  showlegend = TRUE,
  inherit = FALSE
)

# Show the figure
fig

This works like a charm when the data aren't grouped. With a grouped boxplot, however, the outlines are drawn as separate grouped items next to the existing boxes instead of on top of them, and the colors of the existing boxes change, as in this image:

[Image: grouped boxplot in which the green outlines appear as separate groups]

I'm not sure if the grouping attribute can be forced somehow. TBH I'm not really sure if this approach is viable.

I would love to be able to change the line colors using the following kind of approach as per the example referenced above, but it doesn't seem to work with pre-defined quantiles:

built_fig <- plotly_build(fig)

lapply(seq_along(stats_df$site_status),
       function(i){
         nm = stats_df$site_status[i]
         cr = ifelse(nm == "Good",
                     "#66FF66", "black")
         built_fig$x$data[[i]]$line$color <<- cr  # set the line color of trace i by site_status
       }
)
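
One way to see why the loop above may not line up with the boxes is to inspect the built object. The sketch below is only an illustration: stats_demo is a made-up stand-in for stats_df, and it assumes that plot_ly creates one trace per level of the color variable, with each trace holding the quantiles of several boxes as vectors. If that holds, line$color applies to a whole trace rather than to a single box, so indexing traces by row of stats_df cannot work.

```r
library(plotly)

# Hypothetical stand-in for stats_df (made-up values, 4 sites x 2 types).
stats_demo <- data.frame(
  site_name   = rep(paste0("site_", 1:4), each = 2),
  site_type   = rep(c("A", "B"), times = 4),
  lower_fence = seq(10, 45, by = 5),
  q1          = seq(50, 85, by = 5),
  median      = seq(90, 125, by = 5),
  q3          = seq(130, 165, by = 5),
  upper_fence = seq(170, 205, by = 5)
)

fig_demo <- plot_ly(
  data = stats_demo, x = ~site_name, color = ~site_type,
  colors = c("blue", "red"), type = "box",
  lowerfence = ~lower_fence, q1 = ~q1, median = ~median,
  q3 = ~q3, upperfence = ~upper_fence
)
built_demo <- plotly_build(fig_demo)

# Compare the number of traces with the number of boxes (rows).
length(built_demo$x$data)
nrow(stats_demo)

# Look at what a single trace carries.
str(built_demo$x$data[[1]][c("name", "x", "q1")])
```

If the trace count is smaller than the row count, the per-box outline color has to come from splitting the data into more traces rather than from editing the existing ones.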

Any suggestions greatly appreciated.

1 Answer

Here is one option which uses four traces, i.e. one trace for each combination of site type and status, together with the offsetgroup attribute to achieve your desired result without needing to manipulate the built plotly object.

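library(plotly)
library(dplyr) # for filter()

plot_ly(
  # zero-row data: this call only defines the attributes that the traces added below inherit
  data = stats_df |> head(0),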
  lowerfence = ~lower_fence,
  q1 = ~q1,
  median = ~median,
  q3 = ~q3,
  upperfence = ~upper_fence,
  x = ~site_name,
  offsetgroup = ~site_type,
  color = ~site_type, # boxes
  colors = c("blue", "red"),
  type = "box"
) |>
  plotly::add_trace(
    data = stats_df %>% filter(site_status == "Bad", site_type == "A"),
    line = list(color = "green"),
    showlegend = FALSE,
    legendgroup = "A"
  ) |>
  plotly::add_trace(
    data = stats_df %>% filter(site_status == "Bad", site_type == "B"),
    line = list(color = "green"),
    showlegend = FALSE,
    legendgroup = "B"
  ) |>
  plotly::add_trace(
    data = stats_df %>% filter(site_status != "Bad", site_type == "A"),
    line = list(color = "black"),
    legendgroup = "A"
  ) |>
  plotly::add_trace(
    data = stats_df %>% filter(site_status != "Bad", site_type == "B"),
    line = list(color = "black"),
    legendgroup = "B"
  ) |>
  layout(boxmode = "group")
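
The four add_trace calls above could also be generated in a loop. The following is a rough sketch of that idea, not part of the original answer: stats_demo, fill_cols and line_cols are hypothetical names, the values are made up, and the explicit fillcolor is an assumption about how the boxes should be filled. It builds one trace per site_type/site_status combination and reuses offsetgroup and legendgroup as above.

```r
library(plotly)
library(dplyr)

# Hypothetical stand-in for stats_df (made-up values, 4 sites x 2 types).
stats_demo <- data.frame(
  site_name   = rep(paste0("site_", 1:4), each = 2),
  site_type   = rep(c("A", "B"), times = 4),
  site_status = c("Good", "Bad", "Bad", "Good", "Good", "Bad", "Bad", "Good"),
  lower_fence = seq(10, 45, by = 5),
  q1          = seq(50, 85, by = 5),
  median      = seq(90, 125, by = 5),
  q3          = seq(130, 165, by = 5),
  upper_fence = seq(170, 205, by = 5),
  stringsAsFactors = FALSE
)

# One row per trace: every type/status combination present in the data.
combos <- distinct(stats_demo, site_type, site_status)

fill_cols <- c(A = "blue", B = "red")          # assumed fill per site type
line_cols <- c(Good = "black", Bad = "green")  # assumed outline per status

fig <- plot_ly()  # no type here, so the base call adds no trace of its own
for (i in seq_len(nrow(combos))) {
  type_i   <- combos$site_type[i]
  status_i <- combos$site_status[i]
  d <- filter(stats_demo, site_type == type_i, site_status == status_i)
  fig <- add_trace(
    fig, type = "box",
    x = d$site_name,
    lowerfence = d$lower_fence, q1 = d$q1, median = d$median,
    q3 = d$q3, upperfence = d$upper_fence,
    offsetgroup = type_i,   # keeps both statuses of a type in the same slot
    legendgroup = type_i,
    name = type_i,
    fillcolor = fill_cols[[type_i]],
    line = list(color = line_cols[[status_i]]),
    showlegend = status_i == "Good"  # one legend entry per site type
  )
}
fig <- layout(fig, boxmode = "group")
fig
```

Note that a site type with no "Good" rows would get no legend entry in this sketch, so the showlegend rule may need adjusting for real data.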


1 Comment

That's a really elegant solution. Thanks so much for your work, @Stefan. One final question: I should have included overlaid points in my reprex. When I add them like this: |> plotly::add_markers(data = df, x = ~site_name, y = ~value, color = ~site_type, colors = c("blue", "red"), inherit = FALSE) the alignment of the points is messed up. This solution uses a fixer() function to help, but it doesn't work here. Possibly too tangential for a follow-up comment.
