I've got some legacy HTML that I'm trying to clean the cruft out of with a regex. Something like:
<div class="al-list-head"><span>Another List</span></p>
<h3>Destinations</h3>
</div>
Another variant of the HTML could be:
<div class="al-list-head">
<p><span>Another List</span></p>
<h3>Lounge</h3>
</div>
(The CMS sometimes adds redundant <p> tags.)
My regex works for the most part (it matches the second sample) but not for the first. I've tried a bunch of character classes, but I can't seem to match the gap between the </h3> and the final </div> in the first sample.
My regex is...
$html = preg_replace( '/<div class=\"al-list-head\">[\s](<p>?)(<span>Another\ List<\/span>)(<\/p>?)[\s]<h3>([^<\/>]*)<\/h3>[\s]<\/div>/is', '<h3 class="al-head">$4</h3>', $html );
After the <\/h3> I've tried [\s], ([\s]?), ([\s\b\n\r]*) and even (.*), with no luck.
Any pointers?
I'm using this handy little tool to iterate and test; hopefully someone else finds it useful too.
m modifier: your regex ends with /is, try /ism.
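A possible sketch, assuming the two samples are exactly as posted. In the first sample there is no whitespace between the opening <div> and the <span>, but [\s] demands exactly one whitespace character there. Also, <p>? makes only the > optional, not the whole tag. So the match most likely fails at the start of the first sample rather than after the </h3>. The m modifier only changes how ^ and $ behave, and this pattern uses neither, so it probably won't help here. The version below uses \s* for "any whitespace, including none" and (?:...)? for optional tags. The function name clean_list_head is made up for illustration.

```php
<?php
// Hypothetical helper; a tolerant variant of the pattern from the question.
function clean_list_head(string $html): string
{
    // "~" as delimiter so the slashes in closing tags need no escaping.
    $pattern = '~<div class="al-list-head">\s*'  // opening div, then any whitespace or none
             . '(?:<p>)?\s*'                     // the whole <p> tag is optional
             . '<span>Another List</span>\s*'
             . '(?:</p>)?\s*'                    // the whole </p> tag is optional
             . '<h3>([^<]*)</h3>\s*'             // capture the heading text
             . '</div>~i';

    // preg_replace returns null on error; fall back to the untouched input.
    return preg_replace($pattern, '<h3 class="al-head">$1</h3>', $html) ?? $html;
}

$first  = "<div class=\"al-list-head\"><span>Another List</span></p>\n<h3>Destinations</h3>\n</div>";
$second = "<div class=\"al-list-head\">\n<p><span>Another List</span></p>\n<h3>Lounge</h3>\n</div>";

echo clean_list_head($first), "\n";   // <h3 class="al-head">Destinations</h3>
echo clean_list_head($second), "\n";  // <h3 class="al-head">Lounge</h3>
```

Since \s already covers \n and \r, the same pattern should cope with Windows line endings too. If the CMS output varies more than these two samples suggest, a DOM parser is likely a sturdier choice than a regex.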