I have a form which takes HTML, converts it to BBCode and stores it in the database.

Most tag renaming is easily handled with a simple switch; however, text alignment is causing me some trouble.

The HTML is <div style="text-align: right;"></div> and I need to convert it to the BBCode [right][/right].

I can easily do str_replace on the opening tag, but the closing </div> has to become [/left], [/center] or [/right]. I need to know what the opening tag is before I can choose the replacement, hence the problem.

I am bad at simple regex so this one is even more difficult.

In logic terms I am trying to do this:

$str = str_replace("</div>","$align_value",$str);

But I need to know what the opening tag is to set the correct closing tag.

The expected result is it will check what the opening tag is:

// Compare with ===; a single = would assign instead of compare
if ($opening_tag === '<div style="text-align: right;">')
{
 $closing_tag = '[/right]';
} else if ($opening_tag === '<div style="text-align: center;">')
{
 $closing_tag = '[/center]';
} else if ($opening_tag === '<div style="text-align: left;">')
{
 $closing_tag = '[/left]';
} else {
  // Some other div that isn't aligned so do nothing
}

But the key is being able to find what the opening tag is first. Any help appreciated.

  • You should not be using regular expressions for this; you should be using an HTML parser. <div style="text-align: right">, <div class="foo" style="text-align:right">, <div style="color:red; text-align: right;">, <div data-something="text-align: right">, etc. How do you expect to deal with all these possibilities? Commented May 22, 2019 at 22:04
  • There are also libraries to do this for you: github.com/vamsiikrishna/html-to-bbcode was the first one I came across. Commented May 22, 2019 at 22:35

1 Answer

Regular expressions may not be the best tool for this problem. If you wish to use one anyway, capture the text-align value (which I'm guessing will always be left, right, or center) in one group and the element's text content in a second group, then wrap the content in the desired BBCode tags, maybe similar to:

<.+?:\s+([a-z]+);">(.+?)<\/div>

We can also match the closing tag with a broader expression, if necessary:

<.+?:\s+([a-z]+);">(.+?)<\/.+?>

Demo

const regex = /<.+?:\s+([a-z]+);">(.+?)<\/div>/gm;
const str = `<div style="text-align: right;">Anything you wish here</div>
<div style="text-align: center;">Anything you wish here</div>
<div style="text-align: left;">Anything you wish here</div>
<div style="text-align: center;">Anything you wish here</div><div style="text-align: right;">Anything you wish here</div>`;
const subst = `[$1]$2[/$1]`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
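The pattern above accepts any tag and any style property, so an input such as <div style="color: red;">B</div> would come out as [red]B[/red]. A somewhat stricter variant is sketched below. It assumes the only valid alignment values are left, center, and right, and that the div carries nothing but a single text-align declaration; the names strictRegex, sample, and converted are made up for this illustration.

```javascript
// Assumption: only left, center, and right are valid alignment values,
// and the div has no other attributes or style declarations.
// \s* and ;? make the space after the colon and the trailing semicolon optional.
const strictRegex = /<div style="text-align:\s*(left|center|right);?">(.*?)<\/div>/gs;

const sample = '<div style="text-align:right">A</div><div style="color: red;">B</div>';

// Only the aligned div is rewritten; the unrelated div is left untouched.
const converted = sample.replace(strictRegex, '[$1]$2[/$1]');

console.log(converted);
// → [right]A[/right]<div style="color: red;">B</div>
```

This still does not cope with extra attributes or nested divs; it only narrows what is allowed to match.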

PHP

$re = '/<.+?:\s+([a-z]+);">(.+?)<\/.+?>/m';
$str = '<div style="text-align: right;">Anything you wish here</div>
<div style="text-align: center;">Anything you wish here</div>
<div style="text-align: left;">Anything you wish here</div>
<div style="text-align: center;">Anything you wish here</div><div style="text-align: right;">Anything you wish here</div>';
$subst = '[$1]$2[/$1]';

$result = preg_replace($re, $subst, $str);

echo $result;

RegEx

If this expression isn't what you need, it can be modified and tested at regex101.com.

RegEx Circuit

jex.im visualizes regular expressions.


As Niet the Dark Absol points out in the comments, this method does not work with nested tags.
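One possible way around the nesting limitation, while still avoiding a full HTML parser, is to walk the div tags in order and keep a stack of the closing text each open div should receive. The sketch below is not part of the original answer: divsToBBCode is a hypothetical helper, and it assumes well-balanced divs and double-quoted style attributes. An HTML parser would still be the more robust choice.

```javascript
// Hypothetical helper for illustration only.
// Assumptions: every <div> has a matching </div>, and style attributes
// use double quotes.
function divsToBBCode(html) {
  const stack = []; // replacement text for the closing tag of each open <div>
  const tagPattern = /<div\b([^>]*)>|<\/div>/gi;
  const alignPattern = /(?:^|\s)style="[^"]*\btext-align:\s*(left|center|right)\b/i;

  return html.replace(tagPattern, (tag, attrs) => {
    if (attrs !== undefined) {
      // Opening tag: attrs holds everything between "<div" and ">".
      const m = alignPattern.exec(attrs);
      if (m) {
        const align = m[1].toLowerCase();
        stack.push(`[/${align}]`);
        return `[${align}]`;
      }
      // Not an alignment div: keep it and remember to keep its closing tag.
      stack.push('</div>');
      return tag;
    }
    // Closing tag: emit whatever the matching opening tag decided on.
    return stack.length > 0 ? stack.pop() : tag;
  });
}

const nested = '<div style="text-align: right;"><div style="text-align: center;">Nested tags are a problem for your code.</div></div>';

console.log(divsToBBCode(nested));
// → [right][center]Nested tags are a problem for your code.[/center][/right]
```

Because each closing tag is resolved from the stack, an unaligned div such as <div class="x"> keeps its own </div> even when aligned divs are nested inside it. The same idea should port to PHP with preg_replace_callback, though that port is not shown here.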

2 Comments

<div style="text-align: right;"><div style="text-align: center;">Nested tags are a problem for your code.</div></div>
Yes, it would need to work for nested tags also, but otherwise the solution was exactly what I was looking for.
