I want to transform valid HTML with shallow nesting into a more restricted subset of HTML.
Only the following tags are supported in the resulting HTML:
<b></b>, <strong></strong>, <i></i>, <em></em>, <a href="URL"></a>, <code></code>, <pre></pre>
Nested tags are not allowed at all.
For all other tags and their combinations I have to define rules for how to handle each one. So I have to convert something like:
<p>text</p> into the plain string text followed by a line break,
<b>text <a href="url">link</a> text</b> into text link text,
<a href="url">text<code> code here</code></a> into <a href="url">text code here</a>, because <code> is nested inside <a>, and so on.
For example, this HTML (line breaks are only for readability):
<p>long paragraph <a href="url">link</a> </p>
<p>another text <pre><code>my code block</code></pre> the rest of description</p>
<p><code>inline monospaced text with <a href="url">link</a></code></p>
Should be transformed into:
long paragraph <a href="url">link</a>
another text <code>my code block</code> the rest of description
<code>inline monospaced text with link</code>
Any suggestions on how to approach this?
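One possible approach, as a minimal sketch rather than a definitive solution: walk the input with an event-based parser, keep only the outermost supported tag, drop any tag nested inside it, unwrap unsupported tags, and emit a line break when a <p> closes. The example uses Python's standard html.parser; the names ALLOWED, Flattener and flatten are hypothetical. The "keep the outermost tag" rule is an assumption, so the nested <a> inside <code> and <code> inside <a> cases come out as described above, but <b>text <a href="url">link</a> text</b> becomes <b>text link text</b> and <pre><code>...</code></pre> keeps <pre> rather than <code>. Those cases would need extra rules.

```python
from html import escape
from html.parser import HTMLParser

# Tags permitted in the output (taken from the list in the question).
ALLOWED = {"b", "strong", "i", "em", "a", "code", "pre"}


class Flattener(HTMLParser):
    """Keeps the outermost supported tag and drops anything nested in it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tag = None  # supported tag currently open in the output
        self.depth = 0        # same-named tags nested inside open_tag

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # unsupported tag: unwrap it, keep only its text
        if self.open_tag is not None:
            # Already inside a supported tag: nesting is not allowed,
            # so the inner tag is dropped (assumed rule).
            if tag == self.open_tag:
                self.depth += 1
            return
        self.open_tag = tag
        if tag == "a":
            href = dict(attrs).get("href") or ""
            self.out.append(f'<a href="{escape(href, quote=True)}">')
        else:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag == "p":
            self.out.append("\n")  # a paragraph becomes text plus a line break
        if tag != self.open_tag:
            return  # closing tag of something that was dropped or unwrapped
        if self.depth:
            self.depth -= 1  # closes a dropped inner tag with the same name
            return
        self.out.append(f"</{tag}>")
        self.open_tag = None

    def handle_data(self, data):
        # Re-escape text, since the parser has already decoded entities.
        self.out.append(escape(data, quote=False))


def flatten(html_text):
    parser = Flattener()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


print(flatten('<p><code>inline monospaced text with <a href="url">link</a></code></p>'))
```

Because the parser only reports start tags, end tags and text, each rule stays a small branch in one of the three handlers, which makes it fairly easy to add per-tag special cases later.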