Raku has an interesting and exciting recursive-regex notation: <~~>, which re-invokes the enclosing regex from inside itself.
So in the REPL, we can do this:
[0] > 'hellohelloworldworld' ~~ m/ helloworld /;
「helloworld」
[1] > 'hellohelloworldworld' ~~ m/ hello <~~>? world /;
「hellohelloworldworld」
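As a minimal sketch of the same idea outside the REPL (the helper name nested-match and the extra test strings are mine, purely for illustration), the same regex can be run against a few inputs to see how deep the recursion goes. With the unbalanced third input I would expect only the inner helloworld to match, since the outer hello has no closing world.

```raku
# Hypothetical helper (not from the docs): returns the matched text,
# or the string 'no match' when the regex fails.
sub nested-match(Str $s) {
    # <~~> calls this whole regex again at the current position;
    # the ? makes the recursive call optional, so recursion can bottom out.
    my $m = $s ~~ / hello <~~>? world /;
    $m ?? ~$m !! 'no match';
}

# Balanced at depth 1, balanced at depth 3, unbalanced, and incomplete input.
for 'helloworld', 'hellohellohelloworldworldworld', 'hellohelloworld', 'hello' -> $s {
    say "$s => ", nested-match($s);
}
```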
Working directly from the Raku Docs for Recursive Regexes, we can capture and count the various levels of nesting:
~$ raku -pe '#acts like cat here' nest_test.txt
not nested
previous blank
nestA{1}
nestB{nestA{1}2}
nestC{nestB{nestA{1}2}3}
~$ raku -ne 'my $cnt = 0; say m:g/ \{ [ <( <-[{}]>* )> | <( <-[{}]>* <~~>*? <-[{}]>* )> ] \} {++$cnt} /, "\t $cnt -levels nested";' nest_test.txt
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
(「1」) 1 -levels nested
(「nestA{1}2」) 2 -levels nested
(「nestB{nestA{1}2}3」) 3 -levels nested
(Above, change say to put to print only the captured string.)
But I recently ran into an issue while trying to solve a Unix & Linux question: how can the recursion be limited? Let's say we only want to capture what is nested below nestB. Is there any way to do this using the <~~> recursive regex syntax?
~$ raku -ne 'my $cnt = 0; say m:g/ nestB \{ [ <( <-[{}]>* )> | <( <-[{}]>* <~~>*? <-[{}]>* )> ] \} {++$cnt} /, "\t $cnt -levels nested";' nest_test.txt
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
NOTE: Above I've tried to force some sort of 'frugal recursive behavior' by using <~~>*?. In fact, <~~> (the standard recursive notation), <~~>?, <~~>*, and <~~>*? all give identical results (rakudo-moar-2024.09-01).