
In a project I have text with patterns like this:

{| text {| text |} text |}
more text

I want to get the first bracketed part, including the brackets nested inside it. For this I use preg_match with a recursive pattern. The following code already works fine with plain braces:

preg_match('/\{((?>[^\{\}]+)|(?R))*\}/x',$text,$matches);

But if I add the symbol "|", I get an empty result and I don't know why:

preg_match('/\{\|((?>[^\{\}]+)|(?R))*\|\}/x',$text,$matches);

I can't use the first solution because the text can also contain something like { text }. Can somebody tell me what I am doing wrong here? Thanks.

  • You can use balancing groups in .NET, as described here: marcomilani.it/2012/07/… Commented Jul 17, 2012 at 11:46

3 Answers


Try this:

'/(?s)\{\|(?:(?:(?!\{\||\|\}).)++|(?R))*\|\}/'

In your original regex you use the character class [^{}] to match anything except a delimiter. That's fine when the delimiters are only one character long, but yours are two characters. To not-match a multi-character sequence you need something like this:

(?:(?!\{\||\|\}).)++

The dot matches any character (including newlines, thanks to the (?s)), but only after the negative lookahead has determined that it's not the start of a {| or |} sequence. I also dropped your atomic group ((?>...)) and replaced it with a possessive quantifier (++) to reduce clutter. But you should definitely use one or the other in that part of the regex to prevent catastrophic backtracking.
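
To see the pattern in action, here is a minimal sketch. The sample string is modeled on the one in the question; the stray { text } inside it is my own addition, there only to show that single braces no longer get in the way. The variable names are likewise just for illustration.

```php
<?php
// Sample input modeled on the question. The "{ text }" part is an
// assumption added to show that single braces are treated as plain text.
$text = "{| text {| text |} { text } |}\nmore text";

// Pattern from this answer: a dot guarded by a negative lookahead,
// repeated possessively, plus (?R) to recurse into nested {| ... |} pairs.
$pattern = '/(?s)\{\|(?:(?:(?!\{\||\|\}).)++|(?R))*\|\}/';

// preg_match returns 1 on a match, 0 on no match, false on error.
$found = preg_match($pattern, $text, $matches);
$block = ($found === 1) ? $matches[0] : null;

// Show the first balanced block (or NULL if nothing matched).
echo var_export($block, true), "\n";
```

If you need every top-level block rather than just the first one, preg_match_all with the same pattern should do it.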


1 Comment

I just tried your solution and it works well. Thank you very much! And also thanks for the explanation, because it's not easy to understand.

You've got a few suggestions for working regular expressions, but if you're wondering why your original regexp failed, read on. The problem arises when it comes time to match a closing "|}" tag. The (?>[^{}]+) (or [^{}]++) subexpression also matches the "|", so the \|\} subexpression that follows finds only "}" and fails. Because the atomic group cannot backtrack, there's no way to give the "|" back and recover from the failed match.
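
A small sketch of that failure, using the sample text from the question. The third pattern is not a recommendation: it is the question's regex with the atomic group swapped for a plain one (my own variation), just to show that backtracking is what was missing. Without the atomic group or a possessive quantifier it is open to catastrophic backtracking on larger inputs.

```php
<?php
// Sample text from the question.
$text = "{| text {| text |} text |}\nmore text";

// 1. On its own, the atomic group consumes the "|" that the closing
//    "|}" would need.
preg_match('/(?>[^{}]+)/', ' text |}', $m);
$swallowed = $m[0];

// 2. The pattern from the question: the "|" cannot be given back,
//    so the match fails.
$atomic = preg_match('/\{\|((?>[^\{\}]+)|(?R))*\|\}/x', $text, $atomicMatches);

// 3. Illustration only (an assumed variation): a plain group can
//    backtrack and release the "|", so the closing "|}" can match.
$plain = preg_match('/\{\|(?:[^{}]+|(?R))*\|\}/', $text, $plainMatches);

var_dump($swallowed, $atomic, $plain);
```

Note that even the third pattern still breaks on text containing { text }, because [^{}] refuses single braces. That is why the lookahead-based patterns from the other answers are the better fix.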


See PHP - help with my REGEX-based recursive function

To adapt it to your use:

preg_match_all('/\{\|(?:(?!\{\||\|\}).|(?R))*\|\}/s', $text, $matches);
