Multidimensional BBCODE

Question

I am trying to make myself a BBCODE parser in PHP.

Now I have the following Regex:

\[quote\](.*?)\[\/quote\]

Each match should be replaced with:

<div class='quote'><div class='quotetext'>$1</div></div>

This works perfectly until I have a "multidimensional" (nested) post. Example:

[quote] [quote] [quote] text [/quote] [/quote] [/quote]

This should have the following outcome:

<div class='quote'><div class='quotetext'>
      <div class='quote'><div class='quotetext'>
           <div class='quote'><div class='quotetext'>
           text
           </div></div>
      </div></div>
</div></div>

Right now it gets the following outcome:

<div class='quote'><div class='quotetext'> [quote] [quote] text </div></div> [/quote] [/quote]

PHP:

preg_replace("/\[quote\](.*?)\[\/quote\]/", "<div class='quote'><div class='quotetext'>$1</div></div>", $text);

I hope someone can help me with this issue. Thanks.

Where is the PHP code that does this? Can you add this to your question?
– KIKO Software, Commented Sep 15, 2021 at 14:42
Um, sure. But the website regexr does this as well, and it is not built in PHP, so I don't think it is a PHP issue. I have added the code that does this to the question.
– Timberman, Commented Sep 15, 2021 at 14:45
Yes, you're right, but I have to ask (see the answer I gave here). Your preg_replace() probably doesn't do what you think it does: it pairs the first [quote] with the first [/quote], not the outer ones. In this case regular expressions will probably not be the correct solution. Yes, they do have a place in finding things when making this parser, but without building a semi-real DOM, like HTML has, I don't think this will ever work.
– KIKO Software, Commented Sep 15, 2021 at 14:57
Thinking about it, actually it doesn't matter. Simply do two separate replacements: 1. Replace [quote] by <div class='quote'><div class='quotetext'>. 2. Replace [/quote] by </div></div>. As long as the BBCODE is valid this should work out fine.
– KIKO Software, Commented Sep 15, 2021 at 15:02
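
The two-step replacement suggested in the last comment can be sketched as follows. This is only a minimal illustration: the function name quotesToHtml is made up for the example, and it assumes the BBCODE is balanced (every [quote] has a matching [/quote]). Unbalanced input would produce unbalanced HTML.

```php
<?php
// Hypothetical helper (name invented for this sketch).
// Assumption: the input is valid, i.e. every [quote] has a matching [/quote].
function quotesToHtml(string $text): string
{
    // str_replace with two arrays swaps each tag for its HTML counterpart;
    // no regex is involved, so nesting depth is irrelevant.
    return str_replace(
        ['[quote]', '[/quote]'],
        ["<div class='quote'><div class='quotetext'>", '</div></div>'],
        $text
    );
}

echo quotesToHtml('[quote] [quote] text [/quote] [/quote]'), "\n";
```

Because the opening and closing tags are replaced independently, nothing here checks that they pair up. That check is what the regex-based answer below adds.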

Casimir et Hippolyte · Accepted Answer · 2021-09-16 14:40:27Z

A regex approach in one pass:

1. Construct an array that maps each bbcode tag to the corresponding html code.
2. Write a pattern able to match quote bbcode tags, nested or not. The benefit is twofold: it extracts only the valid (balanced) parts, and only those parts are then passed on to the replacement.
3. Do a simple replacement with strtr inside a callback function, using the associative array.

Pro: this is relatively fast, since it needs only one pass and uses strtr.
Con: it isn't flexible, because it only takes into account tags written exactly as [quote], not [quote param="bidule"] or [QUOTE]. (However, nothing prevents you from writing a more elaborate callback function and changing the pattern a little.)

$corr = [
    '[quote]' => '<div class="quote"><div class="quotetext">',
    '[/quote]' => '</div></div>'
];

$pat = '~ \[quote]
          # all that is not a quote tag
          (?<content> [^[]*+ (?: \[ (?! /?quote] ) [^[]* )*+ )
          # an optional recursion ( (?R) is a reference to the whole pattern)
          (?: (?R) (?&content) )*+
          \[/quote]
        ~x';

$result = preg_replace_callback($pat, fn($m) => strtr($m[0], $corr), $str);
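
To illustrate the point that only balanced parts are extracted, here is a small sketch that wraps the code above in a function and feeds it a string with a trailing, unclosed [quote]. The function name replaceBalancedQuotes and the sample input are invented for the example. The pattern is the same one as above.

```php
<?php
// Hypothetical wrapper (name invented for this sketch) around the one-pass approach.
function replaceBalancedQuotes(string $str): string
{
    $corr = [
        '[quote]'  => '<div class="quote"><div class="quotetext">',
        '[/quote]' => '</div></div>',
    ];

    $pat = '~ \[quote]
              # everything that is not a quote tag
              (?<content> [^[]*+ (?: \[ (?! /?quote] ) [^[]* )*+ )
              # zero or more nested quotes, each followed by more content
              (?: (?R) (?&content) )*+
              \[/quote]
            ~x';

    // Only balanced [quote]...[/quote] spans match, so strtr never sees a stray tag.
    // preg_replace_callback returns null on a regex error; fall back to the input then.
    return preg_replace_callback($pat, fn($m) => strtr($m[0], $corr), $str) ?? $str;
}

// The last [quote] has no closing tag, so it should be left as it is.
echo replaceBalancedQuotes('[quote] a [quote] b [/quote] c [/quote] [quote] unclosed'), "\n";
```

The two nested quotes are converted, while the unclosed [quote] at the end stays as plain text.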

A more classical approach with several passes:

1. Build a pattern that forbids nested quote tags; this way, only the innermost tags are replaced on each pass.
2. Put the replacement in a loop and stop when there are no more tags to replace (use the count parameter of preg_replace to detect that).

$pat = '~ \[quote] ( [^[]*+ (?: \[ (?! /? quote] ) [^[]* )*+ ) \[/quote] ~x';
$repl = '<div class="quote"><div class="quotetext">$1</div></div>';

$result = $str;
$count = 0;

do {
    $result = preg_replace($pat, $repl, $result, -1, $count);
} while($count);

Pro: more flexible than the first approach, since you can easily change the pattern and the replacement string.
Con: clearly slower, since you need n+1 passes, where n is the maximum nesting level.

As an aside: why replace a poor [quote] tag with two divs when a single html tag is enough, and when the blockquote tag exists?
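
For what it's worth, a blockquote version of the several-passes approach might look like the sketch below. The function name quotesToBlockquotes is made up, and whether a bare blockquote (without classes) fits the existing stylesheet is an assumption.

```php
<?php
// Hypothetical helper (name invented for this sketch): innermost-first replacement,
// repeated until a pass replaces nothing.
function quotesToBlockquotes(string $str): string
{
    // Matches a [quote]...[/quote] pair that contains no other quote tag.
    $pat  = '~ \[quote] ( [^[]*+ (?: \[ (?! /?quote] ) [^[]* )*+ ) \[/quote] ~x';
    $repl = '<blockquote>$1</blockquote>';

    $count = 0;
    do {
        $next = preg_replace($pat, $repl, $str, -1, $count);
        if ($next === null) {
            break; // regex error: keep the last good string
        }
        $str = $next;
    } while ($count > 0);

    return $str;
}

echo quotesToBlockquotes('[quote]a [quote]b[/quote] c[/quote]'), "\n";
```

With two nesting levels this takes three passes: one per level, plus a final pass that finds nothing and ends the loop.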

edited Sep 16, 2021 at 14:40

answered Sep 15, 2021 at 15:42

3 Comments

Timberman Over a year ago

You are a lifesaver! We had one more problem, but that was caused by the "ignore whitespace" filter.

Casimir et Hippolyte Over a year ago

@Timberman: when the x modifier (ignore whitespace / comments / verbose) is on, you can include a literal whitespace in a pattern in three ways: 1. escape it with a backslash: \ , 2. put it inside a character class: [ ], 3. put it inside a quoted part: \Q \E. It's also possible to switch this modifier off inside a group, like this: (?-x:.....)
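
A quick way to check these options is a small script like the one below. The subject string and the array labels are invented for the illustration. Each pattern uses the x modifier and is tested against a string that contains one space.

```php
<?php
// Illustrative only: subject and labels are made up for this sketch.
$subject = 'a b';

$patterns = [
    'unescaped space'  => '~a b~x',        // the space is ignored, so this looks for "ab"
    'backslash'        => '~a\ b~x',       // escaped space is literal
    'character class'  => '~a[ ]b~x',      // whitespace inside [...] is literal
    'quoted part'      => '~a\Q \Eb~x',    // everything between \Q and \E is literal
    'modifier off'     => '~a(?-x: )b~x',  // x is switched off inside the group
];

foreach ($patterns as $label => $pattern) {
    $found = preg_match($pattern, $subject) === 1;
    printf("%-16s %s\n", $label, $found ? 'matches' : 'no match');
}
```

Only the first pattern fails to match, because its space is discarded before matching.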

Timberman Over a year ago

I figured! Thanks! @Casimir Et Hippolyte
