Solution
Regex
The following regex matches nested brackets recursively, finding an opening 1( on the first level and an opening 3( on the second level (as a direct child of that 1( group). After each match, it attempts successive matches, either on the same level or by moving back through the respective levels to find another occurrence.
~
(?(?=\A)  # IF: First match attempt (if at start of string) - -
    # we are on 1st level => find next "1("
    (?<balanced_brackets>
        # consumes balanced brackets recursively where there is no match
        [^()]*+
        \(  (?&balanced_brackets)*?  \)
    )*?
    # match "1(" => enter level 2
    1\(
  |       # ELSE: Successive matches - - - - - - - - - - - - - -
    \G    # Start at end of last match (level 3)
    # Go down to level 2 - match ")"
    (?&balanced_brackets)*?
    \)
    # or go back to level 1 - matching another ")"
    (?>
        (?&balanced_brackets)*?
        \)
        # and enter level 2 again
        (?&balanced_brackets)*?
        1\(
    )*?
)         # - - - - - - - - - - - -
# we are on level 2 => consume balanced brackets and match "3("
(?&balanced_brackets)*?
3\K\(     # also reset the start of the match
~x
Replacement
(5()
Text
Input:
1(8()3(6()7())9()3())2(4())3()1(0()3())
Output:
1(8()3(5()6()7())9()3(5()))2(4())3()1(0()3(5()))
       ^^^            ^^^                  ^^^
       [1]            [2]                  [3]
regex101 demo
How it works
We start by using a conditional subpattern to distinguish between:
- the first match attempt (from level 1) and
- the successive attempts (starting at level 3, anchored with the \G assertion).
(?(?=\A) # IF followed by start of string
# This is the first attempt
| # ELSE
# This is another attempt
\G # and we'll anchor it to the end of last match
)
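To see the conditional on its own, here is a minimal sketch with a toy pattern (the letters "a" and "b" and the sample string are assumptions made up for illustration, not part of the solution). The IF branch can only succeed at the very start of the subject, and every later match has to begin exactly where the previous one ended.

```php
<?php
// Toy pattern: "a" is only allowed on the first attempt (start of string);
// every later match must begin where the previous match ended (\G) and be "b".
$pattern = '~(?(?=\A)a|\Gb)~';
$subject = 'abbxb';   // hypothetical sample input

preg_match_all($pattern, $subject, $matches);
$found = $matches[0];

// The last "b" comes after "x", so it is not contiguous with the previous
// match and \G rejects it.
print_r($found);
```

This should report three matches (a, b, b): the chain of successive matches breaks at the first position where the ELSE branch fails.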
For the first match, we'll consume all nested brackets that don't match 1(, in order to get the cursor to a position in the first level where it could find a successful match.
- This is a well-known recursive pattern to match nested constructs. If you're unfamiliar with it, please refer to Recursion and Subroutines.
(?<balanced_brackets> # ANY NUMBER OF BALANCED BRACKETS
[^()]*+ # match any non-bracket characters (possessive)
\( # opening bracket
(?&balanced_brackets)*? # with nested brackets (recursively)
\) # closing bracket in the main level
)*? # Repeated any number of times (lazy)
Notice this is a named group that we will use as a subroutine call many times in the pattern to consume unwanted balanced brackets, as (?&balanced_brackets)*?.
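As a rough illustration of what this group consumes, the sketch below runs it in isolation. Two things differ from the solution and are assumptions for the demo: the group is anchored with \A and repeated greedily (+) so that the whole balanced prefix shows up as one match, and the sample string is a made-up fragment.

```php
<?php
// Same named group as above, but anchored and repeated greedily so that
// the entire run of balanced groups is returned as a single match.
$pattern = '~\A
    (?<balanced_brackets>
        [^()]*+                        # text before an opening bracket
        \( (?&balanced_brackets)*? \)  # one bracket pair, nested pairs inside
    )+
~x';
$subject = '8()3(6()7())9()3())2(';   // hypothetical fragment

preg_match($pattern, $subject, $m);
$prefix = $m[0];

// Matching stops at the first ")" that has no partner.
echo $prefix, "\n";
```

Note that the group only allows non-bracket text before an opening bracket, which fits the digit-then-bracket shape of the input used here.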
Next levels. Now, to enter level 2, we need to match:
1\(
And finally, we'll consume any balanced brackets until we find the opening of the 3rd level:
(?&balanced_brackets)*?
3\(
That's it. We've just matched our first occurrence, so we can insert the replacement text in that position.
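The first-attempt path can be tried by itself. This is only a sketch: it keeps just the IF branch, anchors it with \A, and uses \K before the final bracket (as the full pattern does) so that the reported match is the level-3 opening bracket.

```php
<?php
// First-attempt branch only, taken out of the conditional.
$pattern = '~\A
    (?<balanced_brackets> [^()]*+ \( (?&balanced_brackets)*? \) )*?
    1\(                        # enter level 2
    (?&balanced_brackets)*?    # skip level-2 siblings such as 8()
    3\K\(                      # opening of level 3, match starts at "("
~x';
$subject = '1(8()3(6()7())9()3())2(4())3()1(0()3())';

preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE);
list($text, $offset) = $m[0];

echo "matched '$text' at offset $offset\n";
```

The reported offset (6) is the position where the replacement text gets inserted for occurrence [1].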
Next match. For the successive match attempts, we can either:
- go down to level 2 by matching a closing ) to find another occurrence of 3(
- go further down to level 1 by matching two closing ) and, from there, match using the same strategy as we used for the first match.
This is achieved with the following subpattern:
\G # anchored to the end of last match (level 3)
(?&balanced_brackets)*? # consume any balanced brackets
\) # go down to level 2
#
(?> # And optionally
(?&balanced_brackets)*? # consume level 2 brackets
\) # to go down to level 1
(?&balanced_brackets)*? # consume level 1 brackets
1\( # and go up to level 2 again
)*? # As many times as it needs to (lazy)
To conclude, we can match the opening of the 3rd level:
(?&balanced_brackets)*?
3\(
We'll also reset the start of match near the end of the pattern, with \K, to only match the last opening bracket. Thus, we can simply replace with (5(), avoiding the use of backreferences.
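A small sketch of what \K buys us, using a made-up fragment as input. The second call is an assumed alternative without \K, shown only for comparison: it has to capture the "3" and put it back with a backreference.

```php
<?php
$subject = '3(6())';   // hypothetical fragment

// With \K the "3" must be present but is excluded from the match,
// so only the "(" is replaced.
$withK = preg_replace('~3\K\(~', '(5()', $subject);

// Without \K: capture the "3" and restore it through a backreference.
$withBackref = preg_replace('~(3)\(~', '${1}(5()', $subject);

echo $withK, "\n", $withBackref, "\n";
```

Both calls should produce the same string, 3(5()6()).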
PHP Code
We only need to call preg_replace() with the same values used above.
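A minimal sketch of that call, assuming the pattern, replacement and input shown earlier (the variable names are placeholders, and the comments inside the pattern are shortened):

```php
<?php
$pattern = '~
(?(?=\A)                       # IF: first attempt, at the start of the string
    (?<balanced_brackets>      #   skip balanced groups that are not "1("
        [^()]*+
        \( (?&balanced_brackets)*? \)
    )*?
    1\(                        #   enter level 2
  |                            # ELSE: successive attempts
    \G                         #   resume at the end of the last match
    (?&balanced_brackets)*?
    \)                         #   close level 3, back on level 2
    (?>                        #   optionally leave level 2 and re-enter it
        (?&balanced_brackets)*?
        \)
        (?&balanced_brackets)*?
        1\(
    )*?
)
(?&balanced_brackets)*?        # on level 2: skip siblings
3\K\(                          # match "3(" but keep only the "("
~x';
$replacement = '(5()';
$subject = '1(8()3(6()7())9()3())2(4())3()1(0()3())';

$result = preg_replace($pattern, $replacement, $subject);

echo $result, "\n";
```

This should print the output string listed in the Text section, with 5() inserted at the three marked positions.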
Ideone demo
Why did your regex fail?
Since you asked: the pattern is anchored to the start of the string, so it can only match the first occurrence.
/^( (\((((?>[^()]+)|(?R))*)\))* 1\( (\((((?>[^()]+)|(?R))*)\))* 3\()/x
Also, it doesn't even match the first occurrence, because the construct (?R) recurses the whole pattern (trying to match ^ again). We could change (?R) to (?2).
The main reason, though, is that it doesn't consume the characters before an opening \(. For example:
Input:
1(8()3(6()7())9()3())2(4())3()1(0()3())
  ^
  this "8" can't be consumed by the pattern
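The effect can be reproduced on a single sibling. In this sketch both patterns are simplified stand-ins written for illustration (not the original pattern itself): the first uses a bare bracket group, the second is a hypothetical variant that first consumes the text in front of the bracket.

```php
<?php
$subject = '8()';   // a level-2 sibling taken from the input

// Bare bracket group: it can only start at "(", so the "8" blocks it.
$bare = '~\A (?: ( \( (?: [^()]++ | (?1) )* \) ) )+~x';

// Hypothetical variant: consume non-bracket text before each bracket pair.
$prefixed = '~\A (?: [^()]*+ ( \( (?: [^()]++ | (?1) )* \) ) )+~x';

$bareMatches = preg_match($bare, $subject);
$prefixedMatches = preg_match($prefixed, $subject);

var_dump($bareMatches, $prefixedMatches);
```

The bare group reports no match (0), while the prefixed variant matches (1).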
There's also a behaviour that should be considered: PCRE treats recursion as atomic (PCRE2 changed this in release 10.30, but older versions still behave this way). So you have to make sure that the pattern consumes unwanted brackets as in the above example, but also avoids consuming the 1( or 3( you want to match in their respective levels.