I have prepared a whitelist of allowed styles, and I want to remove every style that is not on the whitelist from an HTML string.
$allowed_styles = array('font-size','color','font-family','text-align','margin-left');
$html = 'xyz html';
$html_string = '<body>' . $html . '</body>';
$dom = new DOMDocument();
$dom->loadHTML($html_string);
$elements = $dom->getElementsByTagName('body');
foreach($elements as $element) {
foreach($element->childNodes as $child) {
if($child->hasAttribute('style')) {
$style = strtolower(trim($child->getAttribute('style')));
//match and get only the CSS Property name
preg_match_all('/(?<names>[a-z\-]+):/', $style, $matches);
for($i=0;$i<sizeof($matches["names"]);$i++) {
$style_property = $matches["names"][$i];
// if the CSS property is not in the allowed styles array
// then remove the whole style attribute from this child
if(!in_array($style_property,$allowed_styles)) {
$child->removeAttribute('style');
break;
}
}
}
}
}
$dom->saveHTML();
$html_output = $dom->getElementsByTagName('body');
I have tested many HTML strings and it works fine everywhere. But when I tried to filter this HTML string:
$html_string = '<div style="font-style: italic; text-align: center;
background-color: red;">On The Contrary</div><span
style="font-style: italic; background-color: rgb(244, 249, 255);
font-size: 32px;"><b style="text-align: center;
background-color: rgb(255, 255, 255);">This is USA</b></span>';
All the disallowed styles are removed from this string except on this line:
<b style="text-align: center; background-color: rgb(255, 255, 255);">
Can someone tell me a more efficient and robust way to remove all styles that are not on the whitelist?
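For reference, here is a minimal sketch of one possible direction, not a definitive fix. The loop above only visits the direct children of body, so the nested b element inside the span is never checked. An XPath query such as `//*[@style]` reaches every element that carries a style attribute, at any depth. The function name `filter_inline_styles` is hypothetical. Unlike the code above, which drops the whole style attribute, this sketch keeps the allowed declarations and discards only the others; that is an assumption about the intended behaviour.

```php
<?php
// Hypothetical helper (not part of the original code): keeps only the
// whitelisted CSS declarations on every element, at any nesting depth.
function filter_inline_styles(string $html, array $allowed_styles): string
{
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $dom = new DOMDocument();
    // Wrap the fragment so that the body element is easy to find again.
    $dom->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // "//*[@style]" matches every element with a style attribute,
    // not only the direct children of body.
    foreach ($xpath->query('//*[@style]') as $element) {
        $kept = [];
        // Assumption: a naive split on ';' is good enough here. It would
        // break on values that themselves contain ';' (e.g. data: URLs).
        foreach (explode(';', $element->getAttribute('style')) as $declaration) {
            $parts = explode(':', $declaration, 2);
            if (count($parts) !== 2) {
                continue;               // empty or malformed declaration
            }
            $name  = strtolower(trim($parts[0]));
            $value = trim($parts[1]);
            if ($value !== '' && in_array($name, $allowed_styles, true)) {
                $kept[] = $name . ': ' . $value;
            }
        }
        if ($kept) {
            $element->setAttribute('style', implode('; ', $kept) . ';');
        } else {
            $element->removeAttribute('style');
        }
    }

    // Serialize only the children of body, without the wrapper markup.
    $body = $dom->getElementsByTagName('body')->item(0);
    $output = '';
    foreach ($body->childNodes as $child) {
        $output .= $dom->saveHTML($child);
    }
    return $output;
}
```

Calling `filter_inline_styles($html_string, $allowed_styles)` on the string above should leave text-align on the div and the b, font-size on the span, and strip font-style and background-color everywhere. The exact whitespace of the serialized output depends on libxml.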