2

I am trying to figure out a regex that detects whether an "and" or "or" operator is passed in a query parameter. I can have a URI like http:$URL/$RESOURCE?$filter="firstName eq 'John' and tenantId eq '32323232'"&$order="asc".

Text 1: firstName eq 'John' and tenantId eq '32323232'
Text 2: firstName like 'J%' or companyName eq 'IBM'
Text 3: companyName like 'John and Sons'

The following regex pattern works for Text 1 and Text 2, but I need a way to filter out Text 3, since the "and" there appears inside a value. Values are always in quotes, so any "and" or "or" inside quotes should be ignored by the regex. Any help filtering out cases like Text 3 would be appreciated. Thanks

public static boolean hasANDorORoperator(String filter) {
    return filter.matches("^(.*?)\\s+(?i)(or|and)\\s+(.*?)$");
}
  • tenantId is just a key of a key-value pair. Commented Nov 12, 2013 at 21:53
  • @Flimzy: Doesn't really matter, does it? Commented Nov 12, 2013 at 21:53
  • @Daemon: That's a modifier that is only applied to that capturing group. Commented Nov 12, 2013 at 21:54
  • Are you expecting escaped quotes in your texts? Commented Nov 12, 2013 at 21:55
  • Welcome to Stack Overflow. Please read the About page soon. With regex questions, it is usually helpful (necessary) to identify which language you are embedding your regex in. It looks like Java or C# to my eyes, but they're unskilled in both languages. Commented Nov 12, 2013 at 21:57

3 Answers

3
(and|or)(?=(?:[^']*'[^']*')*[^']*$)

will only match "and" or "or" if an even number of single quotes follows it. If the operator sits inside a quoted string, an odd number of quotes follows, so the lookahead fails and there is no match.

See it on regex101.

Explanation:

(and|or)  # Match and/or.
(?=       # only if the following can be matched here:
 (?:      # Start of non-capturing group:
  [^']*'  # Match any number of non-quote characters plus a quote
  [^']*'  # twice in a row.
 )*       # Repeat any number of times, including zero.
 [^']*    # Match any remaining non-quote characters
 $        # until the end of the string.
)         # End of lookahead assertion.
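As a rough sketch, the lookahead could be dropped into the question's Java method like this. The class name is made up for illustration, the \s+ on both sides is borrowed from the question's original pattern (it is not part of the regex above) so that words such as "order" do not count as operators, and the sketch assumes values never contain escaped quotes.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class FilterOperatorCheck {
    // The answer's lookahead, with whitespace required around the operator
    // (an assumption carried over from the question's pattern).
    private static final Pattern OPERATOR_OUTSIDE_QUOTES = Pattern.compile(
            "\\s+(?:and|or)\\s+(?=(?:[^']*'[^']*')*[^']*$)",
            Pattern.CASE_INSENSITIVE);

    public static boolean hasANDorORoperator(String filter) {
        // find() scans for the operator anywhere in the text;
        // matches() would require the whole string to match.
        return OPERATOR_OUTSIDE_QUOTES.matcher(filter).find();
    }

    public static void main(String[] args) {
        System.out.println(hasANDorORoperator("firstName eq 'John' and tenantId eq '32323232'"));
        System.out.println(hasANDorORoperator("firstName like 'J%' or companyName eq 'IBM'"));
        System.out.println(hasANDorORoperator("companyName like 'John and Sons'"));
    }
}
```

With the three texts from the question, this should report an operator for Text 1 and Text 2 but not for Text 3.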

2 Comments

+1 you beat me to it! Although I prefer {2} to repeating the non-quote/quote sequence
+1 Thanks for the link, very useful online regex checking tool.
1

If I were you, I'd pull out all of the strings first, like in Text 3's example. I'd first filter out 'John and Sons'.

Then, you'd only be left with the raw commands which you could match with the (.*)\s+(and|or)\s+(.*) regular expression.

Then you wouldn't have to deal with the resulting complicated regular expression.
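A minimal Java sketch of this two-step idea, under the assumption that values are always wrapped in single quotes and never contain escaped quotes. The class and method names are hypothetical, not from the question.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class StripThenMatch {
    // A quoted value: a single quote, any non-quote characters, a single quote.
    private static final Pattern QUOTED_VALUE = Pattern.compile("'[^']*'");
    // The simple operator pattern, applied only after values are blanked out.
    private static final Pattern OPERATOR =
            Pattern.compile("\\s+(?:and|or)\\s+", Pattern.CASE_INSENSITIVE);

    static boolean hasOperatorOutsideQuotes(String filter) {
        // Step 1: replace every quoted value with an empty pair of quotes.
        String withoutValues = QUOTED_VALUE.matcher(filter).replaceAll("''");
        // Step 2: look for and/or in what remains.
        return OPERATOR.matcher(withoutValues).find();
    }

    public static void main(String[] args) {
        System.out.println(hasOperatorOutsideQuotes("firstName eq 'John' and tenantId eq '32323232'"));
        System.out.println(hasOperatorOutsideQuotes("companyName like 'John and Sons'"));
    }
}
```

For Text 3 the intermediate string becomes companyName like '', which no longer contains an operator, so the simple pattern is enough.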


0
/^((.*)('[^']')?)*(and|or)[^']*$/i

should do the trick. I'm capturing anything inside parentheses before matching the and/or, so it shouldn't be a possible match for the and/or anymore. Because most regexp engines backtrack to match later capturing groups, I included the "no '" part ([^']*) at the end.

1 Comment

Single quotes ', not double quotes ".
