2

I'm trying to use the R packages crosstalk and plotly (along with tidyverse packages) to create a panels that displays multiple data sets together. There's two data frames that I want to put into one plot; here's some code that creates data sets with some of these key features:

library(tf)
library(tidyverse)
library(tidyfun)  # pak::pak("tidyfun/tidyfun") to install
library(ggplot2)
library(plotly)
library(crosstalk)

df1 <- tibble(f = tf_rgp(6), fil1 = rep(c("A", "B"), each = 3),
              fil2 = rep(c("C", "D", "E"), times = 2)) %>%
  tf_unnest(f) %>%
  rename(arg = f_arg, value = f_value)

df2 <- tibble(f = tf_rgp(2), fil3 = c("U", "V")) %>%
  tf_unnest(f) %>%
  rename(arg = f_arg, value = f_value)

I want the data in df1 and df2 to be put on a common plot. I also want to be able to filter the subsets of both df1 and df2 that I see. Here was an initial attempt, using ggplotly (my preferred approach; for some reason, it seems more performant than plot_ly in the actual situation I am working on).

df1_selector <- highlight_key(df1)
df1_widgets <- bscols(widths = c(12, 12),
                      filter_checkbox("fil1", "Filter 1", df1_selector,
                                      ~ fil1),
                      filter_checkbox("fil2", "Filter 2", df1_selector,
                                      ~ fil2))

df2_selector <- highlight_key(df2)
df2_widgets <- bscols(widths = 12,
                      filter_checkbox("fil3", "Filter 3", df2_selector,
                                      ~ fil3))
# Using ggplotly
bscols(widths = c(2, 2, 8),
       df1_widgets,
       df2_widgets,
       (ggplot(df1_selector, aes(x = arg, y = value, color = fil1)) +
          geom_line(aes(color = fil1, linetype = fil2,
                        group = interaction(fil1, fil2))) +
          geom_line(data = df2_selector,
                    aes(x = arg, y = value, color = fil3, group = fil3))) %>%
         ggplotly
)

Here's the resulting interface. It looks like it could work.

ggplotly attempt

But it does not work correctly; filters are not choosing the correct subsets. For example, if I restrict my selection to U in the second data set, I only see that data set, when I should see that data set and everything in df1. There's other ways that it does not work that I don't need to detail exhaustively.

ggplotly attempt does not work

Using plot_ly does not give correct behavior either, but as mentioned above, I don't want to use plot_ly because for some reason it has bad performance in the application I actually care about.

# Using plot_ly
bscols(widths = c(2, 2, 8),
       df1_widgets,
       df2_widgets,
       plot_ly(df1_selector, x = ~ arg, y = ~ value) %>%
         add_lines(colors = ~ fil1, linetype = ~ fil2,
                   split = ~ interaction(fil1, fil2)) %>%
         add_lines(data = df2_selector, colors = ~ fil3, split = ~ fil3)
)

I tried asking ChatGPT for insight and ChatGPT said that the problem is that you can't use multiple data sets with plotly and crosstalk as I am attempting to do here. It gave bad solutions, but I did attempt a solution that combines the data sets together into one data set. Here is the result. It unfortunately does not work either.

df_joined_selector <- bind_rows(mutate(df1, dat = "df1"),
                                mutate(df2, dat = "df2")) %>%
  highlight_key
df_joined_widgets <- bscols(widths = rep(12, 3),
                            filter_checkbox("fil1", "Filter 1",
                                            df_joined_selector,
                                            ~ fil1),
                            filter_checkbox("fil2", "Filter 2",
                                            df_joined_selector,
                                            ~ fil2),
                            filter_checkbox("fil3", "Filter 3",
                                            df_joined_selector,
                                            ~ fil3))

bscols(widths = c(4, 8),
       df_joined_widgets,
       (ggplot(df_joined_selector, aes(x = arg, y = value,
                                       group = interaction(fil1, fil2, fil3, dat))) +
          geom_line(aes(color = fil1, linetype = fil2))) %>%
         ggplotly
       )

joined dataframe attempt

In addition to the plot not being right, the subsetting is still incorrect; if I select U for Filter 1 I still will only see the U series, when I want to see everything.

So, how can I get a crosstalk and plotly dashboard that does the kind of subsetting I wish to do?

1 Answer 1

0

So, how can I get a crosstalk and plotly dashboard that does the kind of subsetting I wish to do?

This is a tight set of requirements. It does depend on what you want to do with the resulting widget. It would be very straight forward using a Shiny app, but I assume you don't want to go there; maybe you want to distribute a widget.

Usually crosstalk is used to work on one common dataset, so that's basically what this hacky solution does:

  • Create common variables fil1, fil2, fil3 and rbind df1 & df2 to shared_data
  • Create the plotly with shared_data and remove NA-traces afterwards (hacky part)
  • assign keys from other df-rows to checkboxes vice-versa

Crosstalk basically assigns keys to each row of the underlying sharedData. Ticking the checkboxes on runtime just subsets using a keymap stored in

crosstalk::filter_checkbox(...)[["children"]][[3]][["children"]][[1]]

{
  "map": {
    "U": ["307", "308", "309", "310", "311", "312", "313", "314", "315", "316", "317", "318", "319", "320", "321", "322", "323", "324", "325", "326", "327", "328", "329", "330", "331", "332", "333", "334", "335", "336", "337", "338", "339", "340", "341", "342", "343", "344", "345", "346", "347", "348", "349", "350", "351", "352", "353", "354", "355", "356", "357"],
    "V": ["358", "359", "360", "361", "362", "363", "364", "365", "366", "367", "368", "369", "370", "371", "372", "373", "374", "375", "376", "377", "378", "379", "380", "381", "382", "383", "384", "385", "386", "387", "388", "389", "390", "391", "392", "393", "394", "395", "396", "397", "398", "399", "400", "401", "402", "403", "404", "405", "406", "407", "408"]
  },
  "group": ["SharedDataec7df9a3"] # this group-id seems to be assigned at random 
  # it's the shared data identifier that will be subsetted
} 

See, normally selecting "U" would only select the rows 307:347 because these are the rows where fil3 == "U". But, you want this filter to only be applied to df2, thus we need to also include all keys from df1 so that df1 stays unfiltered. This needs to be done for all checkboxes.

library(plotly)
library(tf)
library(tidyfun)
library(crosstalk)
set.seed(234027236) # tf_rgp uses RNG

df1 <- tibble::tibble(f = tf_rgp(6), fil1 = rep(c("A", "B"), each = 3),
                      fil2 = rep(c("C", "D", "E"), times = 2)) |>
  tf_unnest(f) |>
  rename(arg = f_arg, value = f_value) |>
  transform(fil3 = NA)

df2 <- tibble::tibble(f = tf_rgp(2), fil3 = c("U", "V")) |>
  tf_unnest(f) |>
  rename(arg = f_arg, value = f_value) |>
  transform(fil1 = NA, fil2 = NA)

binded_df <- rbind(df1, df2)
shared_data <- highlight_key(binded_df)

# Helper to extend keymap
extend_keymap <- \(checkbox, extra_keys) {
  keymap <- jsonlite::fromJSON(checkbox[["children"]][[3]][["children"]][[1]])
  new_child <- list(
    map = lapply(keymap$map, \(x) c(extra_keys, x)),
    group = keymap$group
  )
  checkbox[["children"]][[3]][["children"]][[1]] <- jsonlite::toJSON(new_child)
  checkbox
}

df1_keys <- as.character(which(is.na(binded_df$fil3))) 
df2_keys <- as.character(which(!is.na(binded_df$fil3)))

cb1_1 <- filter_checkbox("fil1", "Filter 1", shared_data, ~ fil1) |> extend_keymap(df2_keys)
cb1_2 <- filter_checkbox("fil2", "Filter 2", shared_data, ~ fil2) |> extend_keymap(df2_keys)
cb_2 <- filter_checkbox("fil3", "Filter 3", shared_data, ~ fil3) |> extend_keymap(df1_keys)


p <- ggplotly(ggplot(shared_data, aes(x = arg, y = value, color = fil1)) +
                geom_line(aes(color = fil1, linetype = fil2,
                              group = interaction(fil1, fil2))) +
                geom_line(aes(x = arg, y = value, color = fil3, group = fil3)))

# remove NA-traces
p$x$data <- p$x$data[!unlist(lapply(p$x$data, \(x) grepl("NA", x$name)))]

bscols(widths = c(2, 8), list(cb1_1, cb1_2, cb_2), p)

res

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.