I have the following Regex:

(?<day>\d+). Tag, (?<way>.+)?( \((?<length>\d+?.?\d?)km\))?

And I want to match these three possibilities:

1. Tag, Berlin -> London (500.3km)
2. Tag, London -> Stockholm (183km)
3. Tag, Stockholm (day of rest)

The problem: it no longer matches the length. If I remove the question marks, giving this:

(?<day>\d+). Tag, (?<way>.+)( \((?<length>\d+?.?\d?)km\))

It matches the first and second lines but not the third. I thought I could solve the problem by adding a question mark at the end to make the length group optional. But then the way expression consumes everything and the length group matches nothing. So I added another question mark to the way expression to make it lazy, but that did not help: the way still ends up containing the whole length!

So, is it possible to define different levels of laziness? And if that doesn't exist, how should I change the pattern so that it matches correctly?
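
The behaviour described above can be reproduced with a minimal Python sketch. The regex flavor is not stated in the question, so this assumes Python's `re` module, which needs the named groups rewritten from `(?<name>...)` to `(?P<name>...)`; the variable names are made up for illustration. It shows that the greedy `.+` in way takes the rest of the line, so the optional length group is left with nothing to match.

```python
import re

# The first pattern from the question, unchanged apart from Python's
# (?P<name>...) syntax for named groups (an assumption about the flavor).
ORIGINAL = re.compile(
    r"(?P<day>\d+). Tag, (?P<way>.+)?( \((?P<length>\d+?.?\d?)km\))?"
)

m = ORIGINAL.match("1. Tag, Berlin -> London (500.3km)")

# The greedy .+ consumes everything up to the end of the line ...
print(m.group("way"))
# ... so the optional length group does not participate in the match.
print(m.group("length"))
```

Since an optional group is always allowed to match nothing, the engine has no reason to backtrack out of the greedy `.+` once the overall match has succeeded.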

Julian

  • You can't match (day of rest) with (\d+?.?)km. What is the expected output? Commented Sep 8, 2015 at 8:21
  • I think the problem is that the .+ in <way> can match the open bracket character (. Maybe use [^\(] instead? Commented Sep 8, 2015 at 8:22
  • @stribizhev: That's right, because it should not match "(day of rest)". If there is a kilometer expression, the length should match it. If there is no kilometer expression, the way will match it. And that's the point: the way always matches everything, including the kilometers. Commented Sep 8, 2015 at 8:30
  • I see, so (?<day>\d+)\.\s+Tag,\s+(?<way>.+?)\s+\((?<length>[^()]+)\) is not a solution, right? What is the regex flavor, BTW? Commented Sep 8, 2015 at 8:31
  • @pzelasko: Unfortunately "(day of rest)" should be matched by the way expression. So if I forbid the bracket, the day of rest won't be matched. Commented Sep 8, 2015 at 8:32

1 Answer

Here is a way to match all the expected elements in your input:

(?<day>\d+)\.\s+Tag,\s+(?<way>(?:[^()]|\((?!\d+(?:\.\d+)?km)[^()]*\))*?)(?:$|\s*(?<length>\(\d+(?:\.\d+)?km\)))

The way group lazily matches any run of characters other than parentheses, together with parenthesized parts that do not start with an integer or decimal number followed by km. The length group is matched only if it is present; otherwise the match has to reach the end of the line. Also note that a literal dot must be escaped (\.).
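
As a rough check of the pattern above, here is a minimal Python sketch run against the three sample lines from the question. It assumes Python's `re` module, so the named groups are written as `(?P<name>...)`. It also moves the parentheses and `km` outside the length group so that length holds only the number; that is an adaptation for illustration, not the expression exactly as posted. `parse_line` is a hypothetical helper name.

```python
import re

# Adapted from the answer's pattern (assumptions: Python flavor, and the
# length group narrowed to capture the number without "(", "km", ")").
PATTERN = re.compile(
    r"(?P<day>\d+)\.\s+Tag,\s+"
    # way: non-parenthesis characters, or a (...) part that is not "<number>km"
    r"(?P<way>(?:[^()]|\((?!\d+(?:\.\d+)?km)[^()]*\))*?)"
    # either the end of the line, or an optional-space "(<number>km)" suffix
    r"(?:$|\s*\((?P<length>\d+(?:\.\d+)?)km\))"
)


def parse_line(line):
    """Return the named groups as a dict, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None


for line in (
    "1. Tag, Berlin -> London (500.3km)",
    "2. Tag, London -> Stockholm (183km)",
    "3. Tag, Stockholm (day of rest)",
):
    print(parse_line(line))
```

For the third line the length entry is `None`, because the match ends through the `$` alternative and the length group never participates.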

4 Comments

Hm... nearly. Now the kilometers are also in the way expression :/ The way should only match "Berlin -> London", without the (500.3km).
Well, I have revamped the expression a bit. Does it work as expected now?
1. Tag, Berlin -> London (500.3km)
   day: 1, way: Berlin -> London, length: 500.3
2. Tag, London -> Stockholm (183km)
   day: 2, way: London -> Stockholm, length: 183
3. Tag, Stockholm (day of rest)
   day: 3, way: Stockholm (day of rest), length: (empty)
Thanks a lot for your help, it works fine :) Now that you have done it, it seems so easy -.-