To also remove the whitespace between tags, you can use the following method, which combines a regex with a trim of the leading and trailing spaces of the input HTML:
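// Requires: using System.Net; using System.Text.RegularExpressions;
public static string StripHtml(string inputHTML)
{
    // Matches a tag plus the whitespace separating it from the next tag, or a tag on its own.
    const string HTML_MARKUP_REGEX_PATTERN = @"<[^>]+>\s+(?=<)|<[^>]+>";
    // Decode HTML entities, then trim whitespace around the whole input.
    inputHTML = WebUtility.HtmlDecode(inputHTML).Trim();
    string noHTML = Regex.Replace(inputHTML, HTML_MARKUP_REGEX_PATTERN, string.Empty);
    return noHTML;
}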
So for the following input:
<p> <strong> <em><span style="text-decoration:underline;background-color:#cc6600;"></span><span style="text-decoration:underline;background-color:#cc6600;color:#663333;"><del> test text </del></span></em></strong></p><p><strong><span style="background-color:#999900;"> test 1 </span></strong></p><p><strong><em><span style="background-color:#333366;"> test 2 </span></em></strong></p><p><strong><em><span style="text-decoration:underline;background-color:#006600;"> test 3 </span></em></strong></p>
The output contains only the text, without the whitespace between HTML tags or before and after the HTML:
" test text  test 1  test 2  test 3 "
Note that the space before "test text" comes from the <del> test text </del> HTML, and the space after "test 3" comes from the <em><span style="text-decoration:underline;background-color:#006600;"> test 3 </span></em></strong></p> HTML. Whitespace inside an element's text is kept, which is also why two spaces separate adjacent segments: one from the end of a segment and one from the start of the next.
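If the spaces that remain inside the text are unwanted as well, one option is to normalize the result after stripping. The following is a minimal sketch, not part of the original method: the class name `HtmlTextDemo` and the helper `StripHtmlAndNormalize` are hypothetical names introduced here, and `StripHtml` is reproduced from above only so that the example compiles on its own.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextDemo
{
    // Same pattern as above: a tag plus the whitespace before the next tag, or a bare tag.
    private const string HTML_MARKUP_REGEX_PATTERN = @"<[^>]+>\s+(?=<)|<[^>]+>";

    // The method from the answer, reproduced so the sketch is self-contained.
    public static string StripHtml(string inputHTML)
    {
        inputHTML = WebUtility.HtmlDecode(inputHTML).Trim();
        return Regex.Replace(inputHTML, HTML_MARKUP_REGEX_PATTERN, string.Empty);
    }

    // Hypothetical helper (an assumption, not in the original answer):
    // collapses every run of whitespace to a single space and trims the ends.
    public static string StripHtmlAndNormalize(string inputHTML)
    {
        string stripped = StripHtml(inputHTML);
        return Regex.Replace(stripped, @"\s+", " ").Trim();
    }
}
```

For the shortened input `<p> <strong><del> test text </del></strong></p><p><span> test 1 </span></p>`, `StripHtml` should return `" test text  test 1 "` (inner spaces kept), while `StripHtmlAndNormalize` should return `"test text test 1"`. Whether to normalize depends on whether the whitespace inside the elements is meaningful for your use case.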