
I need to parse a string that contains recursively nested parentheses, but I'm having trouble determining the priority of the parentheses. For example, I have the string

$truth = "((A^¬B)->C)";

and I need to return what is between the parentheses. I've already done it with the following regex:

preg_match_all("~\((.*?)\)~", $truth, $str);

But the problem is that it returns what is between the first "(" and the first ")", which is

(A^¬B

Instead, I need it to 'know' where each parenthesis closes correctly, so that it returns

(A^¬B)->C

How can I return this respecting the priority order? Thanks!

  • You could use a negated character class and match anything but parentheses with [^\(\)]* instead of .*, but you will probably still run into problems depending on the complexity of the expression you're trying to parse, especially if it's malformed. Regular expressions are handy, but they don't suit every parsing problem. Commented Nov 11, 2018 at 1:43
  • Regular expressions are not adequate for parsing a language. Try a parser generator. stackoverflow.com/questions/3720362/… Commented Nov 11, 2018 at 1:46

2 Answers


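The main problem right now is the non-greedy ? modifier. If you change .*? to a greedy .+, it will match what you want: the greedy version backtracks from the end of the string, so it stops at the last ")" rather than the first.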

$truth = "((A^¬B)->C)";
// The capture group keeps the outer pair of parentheses out of the result.
preg_match('/\((.+)\)/', $truth, $match);
echo $match[1];

Output

(A^¬B)->C

If you want to match the inner pair you can use a recursive subpattern:

$truth = "((A^¬B)->C)";
preg_match('/\(([^()]+|(?0))\)/', $truth, $match);
echo $match[1];

Output

A^¬B
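
The pattern above stops at the first innermost pair. If the goal is instead to match one whole balanced group regardless of nesting depth, a common PCRE idiom puts a quantifier around the alternation so the recursion can repeat. The following is only a minimal sketch, assuming the input has well-formed parentheses; the function name outerGroup and the second sample string with trailing text are made up for illustration and are not part of the original answer.

```php
<?php
// Return the contents of the first balanced parenthesised group, or null.
// Hypothetical helper for illustration only.
function outerGroup(string $input): ?string
{
    // \(              opening parenthesis
    // (               group 1: everything inside the balanced pair
    //   (?:[^()]++    a run of non-parenthesis characters (possessive)
    //     |(?R)       or a nested balanced group (recurse into the whole pattern)
    //   )*            repeated any number of times
    // )
    // \)              the matching closing parenthesis
    $pattern = '~\(((?:[^()]++|(?R))*)\)~';

    return preg_match($pattern, $input, $m) === 1 ? $m[1] : null;
}

echo outerGroup("((A^¬B)->C)"), "\n";        // (A^¬B)->C
echo outerGroup("((A^¬B)->C) v (D)"), "\n";  // (A^¬B)->C
```

On the second string a greedy .+ would run on to the last ")" and return (A^¬B)->C) v (D, whereas the recursive pattern stops at the parenthesis that actually closes the first group.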

If you need to go further than that, you can write a lexer/parser. I have some examples here:

https://github.com/ArtisticPhoenix/MISC/tree/master/Lexers
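
As a rough idea of what a hand-written scanner for this problem might look like, here is a small stack-based sketch. It is not taken from the linked repository; the function name parenGroups and the shape of the returned array are assumptions chosen for illustration. It walks the string once, pushes the offset of every "(", and emits a group each time a ")" closes the most recent one.

```php
<?php
// List the contents of every parenthesised group together with its nesting depth.
// Hypothetical helper for illustration; unrelated to the linked lexers.
function parenGroups(string $input): array
{
    $stack  = [];   // byte offsets of the "(" characters that are still open
    $groups = [];   // each entry: ['depth' => int, 'text' => string]

    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        if ($input[$i] === '(') {
            $stack[] = $i;
        } elseif ($input[$i] === ')') {
            if ($stack === []) {
                throw new InvalidArgumentException("Unmatched ')' at offset $i");
            }
            $start = array_pop($stack);
            $groups[] = [
                'depth' => count($stack) + 1,   // 1 = outermost
                'text'  => substr($input, $start + 1, $i - $start - 1),
            ];
        }
    }
    if ($stack !== []) {
        throw new InvalidArgumentException("Unmatched '(' at offset " . end($stack));
    }

    return $groups;   // ordered by closing parenthesis, so inner groups come first
}

foreach (parenGroups("((A^¬B)->C)") as $g) {
    echo $g['depth'], ": ", $g['text'], "\n";
}
// 2: A^¬B
// 1: (A^¬B)->C
```

Unlike a single regex, this also reports malformed input instead of silently returning a partial match, which is the situation the comments on the question warn about.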


4 Comments

Thanks! It solved my problem. And thanks to the others too, it will be useful. =D
Sure. I just added my output converter to my website, artisticphoenix.com/2018/11/11/output-converter. It uses the same parsing idea but can convert var_export and print_r output to usable arrays, something I have to do a lot on here... lol
@ArtisticPhoenix I was just thinking I was going to have to write the same tool myself! Thanks for sharing...
Sure, my site is still a work in progress, lol. I don't get a lot of time to work on it, unfortunately.

For your sample string, something like this will give you the contents of each nested pair of parentheses, working from the outside in. It forces the matched parentheses to be the outermost pair by anchoring the regex with ^[^(]* at the start and [^)]*$ at the end.

$truth = "((A^¬B)->C)";
while (strpos($truth, '(') !== false) {
    preg_match("~^[^(]*\((.*?)\)[^)]*$~", $truth, $str);
    $truth = $str[1];
    echo "$truth\n";
}

Output

(A^¬B)->C
A^¬B

Note, however, that this will not correctly parse a string with sibling groups such as (A+B)-(C+D). If that could be your scenario, this answer might help.
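
To make the limitation concrete: on (A+B)-(C+D), the first pass of the loop captures A+B)-(C+D, because the anchors force the match to run from the first "(" to the last ")". One possible way to handle sibling groups, which is not necessarily what the linked answer does, is to let preg_match_all find each balanced top-level group with a recursive pattern. This is a sketch under the assumption of well-formed input; the name topLevelGroups is hypothetical.

```php
<?php
// Contents of each top-level balanced group, left to right.
// Hypothetical helper for illustration only.
function topLevelGroups(string $input): array
{
    // (?R) recurses into the whole pattern, so nested pairs stay inside group 1.
    preg_match_all('~\(((?:[^()]++|(?R))*)\)~', $input, $m);

    return $m[1];
}

print_r(topLevelGroups("(A+B)-(C+D)"));   // two entries: A+B and C+D
print_r(topLevelGroups("((A^¬B)->C)"));   // one entry: (A^¬B)->C
```

Calling the function again on each returned string would descend one level further, in the same spirit as the loop above.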
