I'm trying to write a PHP function that converts BBCode-style tags from a form into HTML tags. Converting simple tags with preg_replace was easy, but I run into trouble when a BBCode tag is nested inside the same BBCode tag...

Like this:

[blue]My [black]house is [blue]very[/blue] beautiful[/black] today[/blue]

So, when I "parse" it, I always end up with leftover BBCode for the blue tags. Something like:

My house is [blue]very[/blue] beautiful today

Everything is colored except for the blue-tag inside the black-tag inside the first blue-tag.

How the hell can I do that?

For more information, here is what I tried:

Regex: "/\[blue\](.*)\[\/blue\]/si" or "/\[blue\](.*)\[\/blue\]/i"
Result: "My house is [blue]very[/blue] beautiful today"

Regex: "/\[blue\](.*?)\[\/blue\]/si" or "/\[blue\](.*)\[\/blue\]/Ui"
Result: "My house is [blue]very beautiful today[/blue]"
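
For reference, a minimal reproduction of both attempts. The `<blue>` replacement is only a stand-in (an assumption) for whatever HTML is actually emitted, and the [black] tags are left alone here to keep the difference visible:

```php
<?php
$text = '[blue]My [black]house is [blue]very[/blue] beautiful[/black] today[/blue]';

// Greedy: (.*) runs to the LAST [/blue], so only the outer pair is consumed
// and the inner [blue]...[/blue] stays inside the captured content.
$greedy = preg_replace('/\[blue\](.*)\[\/blue\]/si', '<blue>$1</blue>', $text);

// Lazy: (.*?) stops at the FIRST [/blue], so the outer opening tag is paired
// with the inner closing tag.
$lazy = preg_replace('/\[blue\](.*?)\[\/blue\]/si', '<blue>$1</blue>', $text);

echo $greedy, PHP_EOL;
// <blue>My [black]house is [blue]very[/blue] beautiful[/black] today</blue>
echo $lazy, PHP_EOL;
// <blue>My [black]house is [blue]very</blue> beautiful[/black] today[/blue]
```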

Do I have to loop the preg_replace? Isn't there a way to do it, regex-style, without looping?

Thanks for your help. :)

  • I'd suggest searching for "php bbcode library". Parse it into HTML and then deal with it using appropriate DOM handling tools. Don't try to reinvent the wheel. Commented Jan 27, 2017 at 22:19
  • Can you clarify further, please? As far as I understand, you are replacing BBCode tags with HTML tags? Commented Jan 27, 2017 at 22:20
  • @UmurKaragöz Exactly. It starts as BBCode, and I want it as HTML tags! miken32 You're right, I shouldn't reinvent the wheel, but I'm curious and I'd like to know how I can do that :) Commented Jan 27, 2017 at 22:26
  • Please don't do shortcodes by hand, use a well-established library like my Shortcode, it will allow you to replace them with whatever you want: github.com/thunderer/Shortcode Commented Apr 21, 2017 at 13:16

2 Answers


It is true that for real products you should not reinvent the wheel and should choose well-tested libraries instead. However, if you are experimenting or working on pet projects, by all means go ahead, have fun and gain useful knowledge in the process.

With that said, you may try the following regex. I'll break it down below.

(\[(.*?)\])(.*)(\[/\2\])

Philosophy

When parsing markup like this, what you are really trying to do is match each opening tag with its closing tag.

So a clean approach is to run a loop that captures the outermost tag pair on each pass and replaces it.

Applied to the sample text, the capture groups of the regex above give you the following on the first pass:

  1. Opening tag (complete): [blue]
  2. Opening tag (tag name): blue
  3. Content between opening and closing tag: My [black]house is [blue]very[/blue] beautiful[/black] today
  4. Closing tag: [/blue]
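
To check these groups yourself, here is a small sketch that runs the pattern once with preg_match on the sample text. It assumes the greedy `(.*)` for the third group and the `s` modifier, as in the code further down:

```php
<?php
$text = '[blue]My [black]house is [blue]very[/blue] beautiful[/black] today[/blue]';

// The greedy (.*) together with the back-reference \2 pairs the first opening
// tag with the last closing tag of the same name.
preg_match('~(\[(.*?)\])(.*)(\[/\2\])~s', $text, $m);

echo $m[1], PHP_EOL; // [blue]
echo $m[2], PHP_EOL; // blue
echo $m[3], PHP_EOL; // My [black]house is [blue]very[/blue] beautiful[/black] today
echo $m[4], PHP_EOL; // [/blue]
```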

So you can use $2 to determine which tag you are processing, and replace the match with:

<tag>$3</tag>
// or even
<$2>$3</$2>

which will give you:

// in first iteration
<tag>My [black]house is [blue]very[/blue] beautiful[/black] today</tag>

// in second iteration
<tag>My <tag2>house is [blue]very[/blue] beautiful</tag2> today</tag>

// in third iteration
<tag>My <tag2>house is <tag3>very</tag3> beautiful</tag2> today</tag>

Code

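$text = "[blue]My [black]house is [blue]very[/blue] beautiful[/black] today[/blue]";

function convert($input)
{
    // Previous state of the string, used to detect a pass that changed nothing.
    $control = $input;

    while (true) {
        // Replace outermost [tag]...[/tag] pairs with <tag>...</tag>.
        // \2 forces the closing tag to carry the same name as the opening tag.
        $input = preg_replace('~(\[(.*?)\])(.*)(\[/\2\])~s', '<$2>$3</$2>', $input);

        // Stop once a pass leaves the string unchanged.
        if ($control === $input) {
            break;
        }

        $control = $input;
    }

    return $input;
}


echo convert($text);
// <blue>My <black>house is <blue>very</blue> beautiful</black> today</blue>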
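
One caveat with the loop above: because `(.*)` is greedy, two sibling pairs of the same tag, such as `[blue]a[/blue] and [blue]b[/blue]`, are matched as a single pair, and any `[thing]...[/thing]` becomes `<thing>...</thing>`. Below is a minimal sketch of a variation that works from the innermost pair outwards and only converts whitelisted tags. The `TAG_MAP` constant, the `convertAllowed` name and the `<span>` markup are all hypothetical and chosen for illustration; it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical whitelist (not from the original post):
// BBCode tag name => [opening HTML, closing HTML].
const TAG_MAP = [
    'blue'  => ['<span style="color:blue">', '</span>'],
    'black' => ['<span style="color:black">', '</span>'],
];

function convertAllowed(string $input): string
{
    // Alternation of the whitelisted names only, e.g. "blue|black".
    $names = implode('|', array_map(
        function (string $name): string {
            return preg_quote($name, '~');
        },
        array_keys(TAG_MAP)
    ));

    // Match a pair whose content contains no whitelisted opening or closing
    // tag, i.e. an innermost pair. The negative lookahead enforces that.
    $pattern = '~\[(' . $names . ')\]((?:(?!\[/?(?:' . $names . ')\]).)*)\[/\1\]~s';

    do {
        $input = preg_replace_callback(
            $pattern,
            function (array $m): string {
                [$open, $close] = TAG_MAP[$m[1]];
                return $open . $m[2] . $close;
            },
            $input,
            -1,
            $count
        );
        // Repeat while the last pass still replaced something.
    } while ($count > 0);

    return $input;
}

echo convertAllowed('[blue]a[/blue] [thing]x[/thing] [blue]b[/blue]'), PHP_EOL;
// <span style="color:blue">a</span> [thing]x[/thing] <span style="color:blue">b</span>
```

The loop is still there, but each pass only touches pairs that are already free of nested whitelisted tags, so the number of passes is bounded by the nesting depth.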

7 Comments

Nice one... I'll take a look. Thanks for your help :) But actually, you are using a loop... Isn't it easier to just loop over all the patterns and replacements? Because this loop could run for a long time, or am I missing something?
Recursive processing is the only way to solve this problem. So iteration will take place in either case; you can only hide it behind the scenes. I have added a code sample. Please let me know if you test it, I am interested to hear whether it performs well or not.
Nice code, I just tested it: it seems to work pretty well. However, I don't want to replace every BBCode "tag" with an HTML tag (e.g. [thing] shouldn't be replaced by <thing>)... So I'm guessing I could use my $pattern, $replacements? I just took a look at preg_replace_callback, but people seem to say "don't use it for BBCode!!"... ^^
Sure, you can replace preg_replace with preg_replace_callback and use it like that. I haven't parsed BBCode myself, but in your place I'd take a few popular BBCode converters and inspect how they deal with the task. And ask those 'people' what the alternative would be.
In any case: thanks a lot! I'll work with what you and @Jan showed me ^^

As others mentioned, don't try to reinvent the wheel.
However, you could use a recursive approach:

<?php

$text = "[blue]My [black]house is [blue]very[/blue] beautiful[/black] today[/blue]";

$regex = '~(\[ ( (?>[^\[\]]+) | (?R) )* \])~x';

$replacements = array(  "blue" => "<bleu>", 
                        "black" => "<noir>", 
                        "/blue" => "</bleu>",
                        "/black" => "</noir>");

$text = preg_replace_callback($regex,
    function($match) use ($replacements) {
        // $match[2] is the tag name without brackets; unknown tags are left untouched.
        return $replacements[$match[2]] ?? $match[0];
    },
    $text);

echo $text;
# <bleu>My <noir>house is <bleu>very</bleu> beautiful</noir> today</bleu>

?>

Here, every colour tag is replaced by a French counterpart (tag names I just made up); see a demo on ideone.com. To learn more about recursive patterns, have a look at the PHP documentation on the subject.
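
To see what the callback receives as `$match[2]`, the same pattern can be run through preg_match_all, as in the small sketch below. On this input every bracketed tag is matched on its own, so the replacement is effectively a per-tag lookup; the `(?R)` branch would only come into play for a bracket inside a bracket, such as `[a[b]c]`.

```php
<?php
$text = '[blue]My [black]house is [blue]very[/blue] beautiful[/black] today[/blue]';
$regex = '~(\[ ( (?>[^\[\]]+) | (?R) )* \])~x';

// Collect every match; $matches[2] holds group 2 of each one,
// which is the tag name without its brackets.
preg_match_all($regex, $text, $matches);

echo implode(', ', $matches[2]), PHP_EOL;
// blue, black, blue, /blue, /black, /blue
```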

1 Comment

Thanks a lot for this information. I'll take a look at that!
