Disclosure: I'm very much a regex newbie, so I'm tweaking some example code I found that parses web server log data into named groups. Here is the part of my modified regex that deals with the URL and query string groups:

(?P<url>.+)(?P<querystr>\?.*)

This works fine when the string it is applied to actually has a query string on the URL (each group gets the expected part of the string), but it fails to match when there is none. So I tried adding a '?' after the "querystr" group to mark it as optional, i.e. (?P<querystr>\?.*)? With that change, if there's no query string it works as expected (nothing is extracted into querystr), but when there is one, it is still captured as part of url rather than separately into querystr.
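
The behaviour described above can be reproduced with a minimal sketch, assuming Python's re module (the (?P<name>...) syntax suggests it) and two hypothetical sample strings that are not taken from the actual log data:

```python
import re

# Hypothetical sample inputs, chosen only for illustration.
with_query = "/index.html?user=42"
without_query = "/index.html"

# Original attempt: the querystr group is required.
required = re.compile(r"(?P<url>.+)(?P<querystr>\?.*)")
# Second attempt: the querystr group is optional, but url is still greedy.
optional = re.compile(r"(?P<url>.+)(?P<querystr>\?.*)?")

# Required group: splits correctly, but only when a '?' is present.
m = required.match(with_query)
print(m.group("url"), m.group("querystr"))  # /index.html ?user=42
print(required.match(without_query))        # None

# Optional group: the greedy .+ takes the whole string first, and the
# optional group is then satisfied by matching nothing at all.
m = optional.match(with_query)
print(m.group("url"), m.group("querystr"))  # /index.html?user=42 None
```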

What's the best way to identify optional groups (assuming that's even the right approach in this case)? Thanks in advance.

  • Use ^(?P<url>.+?)(?P<querystr>\?.*)?$ or ^(?P<url>[^?]+)(?P<querystr>\?.*)?$ Commented Jul 19, 2021 at 20:10
  • Replacing the '.' in the url group with [^?] along with adding the '?' after the querystr group seemed to do the trick. Thanks! Commented Jul 19, 2021 at 20:59

1 Answer

You can use

^(?P<url>[^?]+)(?P<querystr>\?.*)?$

Details

  • ^ - start of string
  • (?P<url>[^?]+) - Group "url": any one or more chars other than ?
  • (?P<querystr>\?.*)? - an optional Group "querystr": a ? char followed by zero or more chars other than line break chars, as many as possible
  • $ - end of string.
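
As a rough check of the pattern above, here is a minimal sketch assuming Python's re module. The helper name split_url and the sample strings are hypothetical and used only for illustration:

```python
import re

# The pattern from the answer: url stops at the first '?',
# and the query string group is optional.
URL_RE = re.compile(r"^(?P<url>[^?]+)(?P<querystr>\?.*)?$")


def split_url(text):
    """Return (url, querystr), or None if the pattern does not match.

    querystr is None when the input has no '?' part.
    """
    m = URL_RE.match(text)
    if m is None:
        return None
    return m.group("url"), m.group("querystr")


print(split_url("/index.html?user=42"))  # ('/index.html', '?user=42')
print(split_url("/index.html"))          # ('/index.html', None)
```

Because [^?] can never consume a ? character, the url group cannot swallow the query string, so the optional group gets the chance to match it. Note that an unmatched optional group is reported as None rather than as an empty string.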
