Sorry for my bad grammar, English is not my main language :)
I'm developing a fully AJAX-driven WordPress theme built around user comments, and I want to add a rich text editor (Froala). As we all know, that brings a lot of security problems.
I don't want to use the HTML Purifier library; it's too heavy.
I think I've found a good way to sanitize all the data sent by users, but I'm stuck.
My sanitization approach:
On form submit: get the editor's HTML and convert it to a BBCode-like format with JavaScript.
var htmlToBBCode = function(html) {
...
html = html.replace(/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/gi, "[url url=$2 kelime=$4]");
html = html.replace(/<textarea(.*?)>(.*?)<\/textarea>/gmi, "\[code]$2\[\/code]");
html = html.replace(/<b>/gi, "[b]");
html = html.replace(/<\/b>/gi, "[/b]");
html = html.replace(/<img(.*?)width="(.*?)"(.*?)height="(.*?)"(.*?)src="(.*?)"(.*?)>/gi, "[img weight=$2 height=$4 src=$6]");
...
}
let editor = new FroalaEditor('#entry_html_input', {}, function () {
});
var bbcode = htmlToBBCode(editor.html.get());
On the server side: sanitize $_POST["comment"] as plain text (so I'm protected against people who send malicious XSS code via the console or a direct AJAX request).
$clean_comment = sanitize_textarea_field(nl2br(wp_unslash($_POST["comment"])));
On the server side: use WordPress's add_shortcode() function.
function img_shortcode($atts, $content)
{
$weight = intval($atts["weight"]);
$height = intval($atts["height"]);
$src = esc_url($atts["src"]);
$return = '<center><img src="' . $src . '" width="'.$weight.'" height="'.$height.'" rel="nofollow"/></center>';
return $return;
}
add_shortcode('img', 'img_shortcode');
function b_shortcode($atts, $content)
{
$bold_text = sanitize_text_field($content);
return '<b>'.$bold_text.'</b>';
}
add_shortcode('b', 'b_shortcode');
And it works perfectly! Correct me if I'm wrong, but I think this is fully secure against XSS. I know every parameter passed to each shortcode and I know how to handle it. If someone bypasses the BBCode converter, sanitize_text_field() still runs when the $_POST data arrives.
But this is where I'm stuck...
When the editor sends HTML like this:
<p>
<strong>
sdfsfsdfsfd
<img src="https://i.ibb.co/9bxPvhM/86a9d5b8bc498dc5eb3689f0b983a5a7a3f1c1bb.jpg" style="width: 300px;" class="fr-fic fr-dib">
qwrrwqqrwqwrrqw
</strong>
</p>
the BBCode converter turns it into this:
[b]
sdfsfsdfsfd
[img]
https://i.ibb.co/9bxPvhM/86a9d5b8bc498dc5eb3689f0b983a5a7a3f1c1bb.jpg
[/img]
qwrrwqqrwqwrrqw
[/b]
And sanitize_text_field() inside the b_shortcode function completely removes the image...
This is only a simple example. As you can imagine, the same thing happens with the "i", "sub", and "sup" tags too.
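To make the problem easier to reproduce, here is a minimal sketch of what the converter does with that kind of markup. The rules below are simplified assumptions for this demo (a <strong> rule analogous to my <b> rule, and an [img] rule that only reads src), not my full converter, and the image URL is a placeholder:

```javascript
// Simplified stand-in for the converter. These few rules are assumptions
// made for this demo; the real converter has many more.
function miniHtmlToBBCode(html) {
  return html
    .replace(/<\/?p>/gi, "")              // drop paragraph wrappers
    .replace(/<strong>/gi, "[b]")         // opening bold tag
    .replace(/<\/strong>/gi, "[/b]")      // closing bold tag
    .replace(/<img[^>]*?src="([^"]*)"[^>]*>/gi, "[img]$1[/img]"); // keep only src
}

// Same shape as the editor output above: an image nested inside <strong>.
const sample =
  '<p><strong>before' +
  '<img src="https://example.com/pic.jpg" style="width: 300px;" class="fr-fic fr-dib">' +
  'after</strong></p>';

const bbcode = miniHtmlToBBCode(sample);
console.log(bbcode);
// [b]before[img]https://example.com/pic.jpg[/img]after[/b]
```

The [img] tag ends up inside [b]...[/b], so whatever b_shortcode does to its $content is also applied to the image.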
I tried solutions like this:
html = html.replace(/<strong>(.*?)<img(.*?)src="(.*?)"(.*?)style="(.*?)"(.*?)>(.*?)<\/strong>/gi, "[bold_image_mixed text_before=$1 text_after=$7 img_url=$3 img_width=$5]");
And yes, this works. But there are so many tags and combinations that can break my image or my YouTube video (iframe) when it is nested inside those tags.
How can I prevent that? What are your thoughts on my approach to sanitizing HTML input?
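To give a rough idea of how quickly the number of combined rules grows, here is a back-of-the-envelope count. The tag lists are only illustrative assumptions (the tags mentioned in this post), and the count only covers a single embed nested inside a chain of distinct wrapper tags:

```javascript
// Illustrative tag lists (assumptions for this estimate, not a complete set).
const wrappers = ["strong", "i", "sub", "sup"];
const embeds = ["img", "iframe"];

// Number of ordered chains of `depth` distinct wrappers: n * (n - 1) * ...
function orderedChains(n, depth) {
  let count = 1;
  for (let k = 0; k < depth; k++) {
    count *= n - k;
  }
  return count;
}

// One combined regex rule would be needed per (wrapper chain, embed) pair.
function combinedRulesNeeded(maxDepth) {
  let total = 0;
  for (let depth = 1; depth <= maxDepth; depth++) {
    total += orderedChains(wrappers.length, depth) * embeds.length;
  }
  return total;
}

console.log(combinedRulesNeeded(1)); // 8
console.log(combinedRulesNeeded(2)); // 32
```

Even with only four wrapper tags and two embeds, two levels of nesting already need 32 hand-written rules, and that ignores attribute order and text placement.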
I hope somebody can help me...