1

I am trying to filter a data frame based on user input via a JSON file parsed into the script.

Given the following inputs:

#tibble to be filtered:

> database
# A tibble: 11 x 6
   strain genotype floodmedia inductionDelay treatment timePoint
   <lgl>  <chr>    <chr>               <dbl> <chr>     <chr>    
 1 NA     CBF1     SCGal                  15 EtoH      -1       
 2 NA     CBF1     SCGal                  15 Estradiol -1       
 3 NA     CBF1     SCGal                  15 EtoH      15       
 4 NA     CBF1     SCGal                  15 Estradiol 15       
 5 NA     CBF1     SCGal                  15 EtoH      90       
 6 NA     CBF1     SCGal                  15 Estradiol 90       
 7 NA     CBF1     SCGal                  15 EtoH      -1       
 8 NA     CBF1     SCGal                  15 Estradiol -1       
 9 NA     CBF1     SCGal                  15 EtoH      15       
10 NA     CBF1     SCGal                  15 Estradiol 15       
11 NA     CBF1     SCGal                  15 EtoH      90   

#JSON Input:

{
    "timePoint":["15", "-1"],
    "treatment":["EtoH"]
}

I would like to return:

# A tibble: 4 x 6
  strain genotype floodmedia inductionDelay treatment timePoint
  <lgl>  <chr>    <chr>               <dbl> <chr>     <chr>    
1 NA     CBF1     SCGal                  15 EtoH      -1       
2 NA     CBF1     SCGal                  15 EtoH      15       
3 NA     CBF1     SCGal                  15 EtoH      -1       
4 NA     CBF1     SCGal                  15 EtoH      15     

I need the solution to be general in terms of the variable names so that given a tibble with different variable names and a JSON matching those new names, the code would still properly filter the new tibble. As such, I have tried to use a for loop to iterate through the parsed json object, but have been unsuccessful thus far.

#code I have tried

json <- read_json("filepath_to_json_file_shown_above")
query <- database
for(i in names(json)){
  query <- query %>% filter(i %in% json$i)
}

I am using the packages tidyverse/dplyr and jsonlite to do the above tasks, though I do not mind using a different package if that is more useful.

Thank you for any help and I apologize if I have made mistakes in my question. This is my first time using StackOverflow. Please let me know if anything is unclear or incorrect.

1 Answer 1

0

I cannot say I am confident in this, but it seems like it might work. I recently came across another post that described a function for filtering on multiple conditions with tidyverse. Perhaps you could adapt something like this below. Of note, this ignores variable types (numeric vs. character).

library(jsonlite)
library(tidyverse)

json_df <- fromJSON(json_file)

df <- unnest(json_df, cols = everything())

my_filter <- function(df, cols, conds){     
  fp <- map2(cols, conds, function(x, y) quo((!!(as.name(x))) %in% !!y))
  filter(df, !!!fp)
}

my_filter(database, cols = as.list(names(df)), conds = as.list(df))

Output

  strain genotype floodmedia inductionDelay treatment timePoint
1     NA     CBF1      SCGal             15      EtoH        -1
2     NA     CBF1      SCGal             15      EtoH        15
3     NA     CBF1      SCGal             15      EtoH        -1
4     NA     CBF1      SCGal             15      EtoH        15

Data

database <- read.table(text = 
"strain genotype floodmedia inductionDelay treatment timePoint
 NA     CBF1     SCGal                  15 EtoH      -1       
 NA     CBF1     SCGal                  15 Estradiol -1       
 NA     CBF1     SCGal                  15 EtoH      15       
 NA     CBF1     SCGal                  15 Estradiol 15       
 NA     CBF1     SCGal                  15 EtoH      90       
 NA     CBF1     SCGal                  15 Estradiol 90       
 NA     CBF1     SCGal                  15 EtoH      -1       
 NA     CBF1     SCGal                  15 Estradiol -1       
 NA     CBF1     SCGal                  15 EtoH      15       
 NA     CBF1     SCGal                  15 Estradiol 15       
 NA     CBF1     SCGal                  15 EtoH      90", header = T)

json_file <- 
'[{
  "timePoint":["15", "-1"],
  "treatment":["EtoH"]
}]'
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks for your help, Ben. When I tried your code, the line df <- unnest(json_df, cols = everything()) gave me the following error: Error in UseMethod("unnest") : no applicable method for 'unnest' applied to an object of class "list". Basically, it is reading the JSON file in as a list instead of a data frame, and unnest can only take in data frames as arguments is what I concluded. I looked around for an answer but wasn't successful. Do you have thoughts?
Nevermind, I figured it out and your method seems to work. Thanks for your help!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.