
I want to find rows that contain any of several strings and store them in a variable, but I keep getting errors.

queries <- httpdf %>% filter(str_detect(payload, "create" || "drop" || "select"))
Error: invalid 'x' type in 'x || y'

queries <- httpdf %>% filter(str_detect(payload, "create" | "drop" | "select"))
Error: operations are possible only for numeric, logical or complex types

queries1 <- httpdf %>% filter(str_detect(payload, "create", "drop", "select"))
Error: unused arguments ("drop", "select")

None of these worked. Is there another way to do it with str_detect, or should I try something else? I also want the matches to show up in the same column.

    I guess you need paste(c('create', 'drop', 'select'), collapse="|") Commented Mar 12, 2016 at 19:41

3 Answers


In my opinion, an even simpler way for a short list of strings like yours is:

queries <- httpdf %>% filter(str_detect(payload, "create|drop|select"))

As this is actually what

[...] paste(c("create", "drop", "select"),collapse = '|')) [...]

does, as recommended by @penguin before.

For a longer list of strings you want to detect I would first store the single strings into a vector and then use @penguin's approach, e.g.:

strings <- c("string1", "string2", "string3", "string4", "string5", "string6")
queries <- httpdf %>% 
  filter(str_detect(payload, paste(strings, collapse = "|")))

This has the advantage that you can easily use the vector strings later on as well if you want to or have to.
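
As a quick check of the point above, here is a minimal sketch. The three-row httpdf is made up for illustration (the asker's real data is not shown). It confirms that the pasted pattern is the same string as the hand-written one, and that the filter keeps rows containing any of the words:

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; the real httpdf is not shown in the question.
httpdf <- data.frame(
  payload = c("create table t", "GET /index.html", "select * from t"),
  stringsAsFactors = FALSE
)

strings <- c("create", "drop", "select")
pattern <- paste(strings, collapse = "|")
pattern
# "create|drop|select"

# In a regular expression "|" means alternation, so a row is kept
# if at least one of the words occurs in payload.
queries <- httpdf %>% filter(str_detect(payload, pattern))
queries$payload
# "create table t"  "select * from t"
```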


This is a way to solve this problem:

queries1 <- httpdf %>% 
  filter(str_detect(payload, paste(c("create", "drop", "select"),collapse = '|')))

2 Comments

With this example I'm getting "creator" (from "the creator is nice") because of "creat", how do I match only the exact word?
Just a heads up that you need to escape reserved regex characters in your strings, for instance replace "." with "\\.", etc.
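
Both points raised in these comments can be handled within the single-pattern approach. The sketch below rests on two assumptions: the inputs are invented for illustration, and str_escape() is available, which requires stringr 1.5.0 or later. It escapes each string, joins them with "|", and wraps the group in "\\b" word boundaries so that only whole words match:

```r
library(stringr)

# Hypothetical inputs chosen to show both pitfalls from the comments.
payload <- c("the createor is nice", "create a table", "see index.html", "indexAhtml")
words   <- c("create", "index.html")

# str_escape() (assumes stringr >= 1.5.0) backslash-escapes regex
# metacharacters, so "." only matches a literal dot.
escaped <- str_escape(words)

# Group the alternatives, then require a word boundary on both sides.
pattern <- str_c("\\b(?:", str_c(escaped, collapse = "|"), ")\\b")

str_detect(payload, pattern)
# FALSE  TRUE  TRUE FALSE
```

Without the escaping step, "index.html" would also match "indexAhtml", because an unescaped "." matches any character.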

I suggest using loops for such operations. They are much more versatile, IMHO.

An example httpdf table (also to answer the comment of RxT):

httpdf <- tibble(
  payload = c(
    "the createor is nice",
    "try to create something to select",
    "never catch a dropping knife",
    "drop it like it's hot",
    NA,
    "totally unrelated" ),
  other_optional_columns = 1:6 )

I use sapply to loop over the search strings and apply each one as an individual pattern to str_detect. This returns a logical matrix with one column per search string and one row per subject string. Summing each row and testing for a non-zero count collapses it into the logical vector you need.

queries1 <-
  httpdf[ 
    sapply(
      c("create", "drop", "select"),
      str_detect,
      string = httpdf$payload ) %>%
    rowSums( na.rm = TRUE ) != 0, ]
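
To make the intermediate matrix visible, here is a small sketch on a shortened, three-element version of the payload column (the shortening is only for illustration):

```r
library(stringr)

payload <- c("the createor is nice", "try to create something to select", NA)

# One column per pattern, one row per payload element.
hits <- sapply(c("create", "drop", "select"), str_detect, string = payload)
dim(hits)
# 3 3

# An NA payload gives NA in every column; na.rm = TRUE counts it as zero matches.
rowSums(hits, na.rm = TRUE)
# 1 2 0
rowSums(hits, na.rm = TRUE) != 0
# TRUE  TRUE FALSE
```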

And of course it can be wrapped in a function to use inside a tidyverse filter:

## function
str_detect_mult <-
  function( subject, query ) {
    sapply(
      query,
      str_detect,
      string = subject ) %>%
    rowSums( na.rm = TRUE ) != 0
}
## tidy code
queries1 <- httpdf %>% filter( str_detect_mult( payload, c("create", "drop", "select") ) )

Word boundaries are easy to handle if you want exact word matches ("\\b" matches a word boundary and is added to the start and end of each search string):

str_detect_mult_exact <-
  function( subject, query ) {
    sapply(
      query,
      function(.x)
        str_detect(
          subject,
          str_c("\\b",.x,"\\b") ) ) %>%
    rowSums( na.rm = TRUE ) != 0
}

Multiple matches are easy to handle too (e.g. if you want only rows matching exactly one of the strings, i.e. XOR):

str_detect_mult_xor <-
  function( subject, query ) {
    sapply(
      query,
      str_detect,
      string = subject ) %>%
    rowSums( na.rm = TRUE ) == 1
}

Also works in base R:

## function
str_detect_mult <-
  function( subject, query ) {
    rowSums(sapply(
      query,
      grepl,
      x = subject ), na.rm = TRUE ) != 0
}
## base R subsetting
queries1 <- httpdf[ str_detect_mult( httpdf$payload, c("create", "drop", "select") ), ]
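
One edge case worth knowing: when subject has length one, sapply simplifies its result to a plain vector rather than a matrix, and rowSums then fails. A minimal alternative sketch combines the per-pattern results with Reduce and "|" instead, which works for any input length. The helper name grepl_any is made up here and is not part of the answer above:

```r
# Hypothetical helper, not part of the original answer.
grepl_any <- function(subject, query) {
  # grepl() returns FALSE for NA input, so no extra NA handling is needed.
  Reduce(`|`, lapply(query, grepl, x = subject))
}

grepl_any("drop it like it's hot", c("create", "drop", "select"))
# TRUE
grepl_any(c("the createor is nice", NA, "totally unrelated"),
          c("create", "drop", "select"))
# TRUE FALSE FALSE
```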
