Regex in .net seems to not work correctly

Question

I want to strip HTML from a string with a regular expression. This regex works everywhere else, but it does not work in .NET, and I don't understand why.

using System;

public class Program
{
    public static void Main()
    {
        var text = "FOO <span style=\"mso-bidi-font-size:11.0pt;\nmso-fareast-language:EN-US\"> BAR";
        // Expected "FOO  BAR", but the <span ...> tag is still present in the output.
        var res = System.Text.RegularExpressions.Regex.Replace(text, "<.*?>", "");
        Console.WriteLine(res);
    }
}

See Regex that matches a newline (\n) in C#

– The fourth bird, Commented Jul 13, 2022 at 8:46

ProgrammingLlama · Accepted Answer · 2022-07-13 08:23:28Z

5

You're missing the correct Regex option:

var res = System.Text.RegularExpressions.Regex.Replace(text, "<.*?>", "", System.Text.RegularExpressions.RegexOptions.Singleline);

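You need this because your HTML contains a newline (\n) inside the tag. By default, . matches every character except \n, so <.*?> can never reach the closing > and nothing is replaced. RegexOptions.Singleline makes . match newline characters as well.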

Docs blurb:

Specifies single-line mode. Changes the meaning of the dot (.) so it matches every character (instead of every character except \n). For more information, see the "Single-line Mode" section in the Regular Expression Options article.
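
To make the difference concrete, here is a minimal sketch that runs the question's pattern with and without RegexOptions.Singleline. The class and method names (SinglelineDemo, StripDefault, StripSingleline) are made up for illustration; only Regex.Replace and RegexOptions.Singleline come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SinglelineDemo
{
    // Sample input from the question: the tag contains a newline.
    public const string Text =
        "FOO <span style=\"mso-bidi-font-size:11.0pt;\nmso-fareast-language:EN-US\"> BAR";

    // Default options: '.' does not match '\n', so the tag is never matched.
    public static string StripDefault(string input) =>
        Regex.Replace(input, "<.*?>", "");

    // Singleline: '.' also matches '\n', so the lazy match can reach '>'.
    public static string StripSingleline(string input) =>
        Regex.Replace(input, "<.*?>", "", RegexOptions.Singleline);

    public static void Main()
    {
        Console.WriteLine(StripDefault(Text) == Text); // True: nothing was replaced
        Console.WriteLine(StripSingleline(Text));      // FOO  BAR
    }
}
```

The two spaces in the second output are the spaces that surrounded the removed tag.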

Antonio Skopin · Accepted Answer · 2022-07-13 08:21:04Z

0

Try this:

System.Text.RegularExpressions.Regex.Replace(text, "<[^>]*>", "");

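This will strip the HTML tags from your string. The negated character class [^>] matches any character except >, including newlines, so no regex option is needed.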
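
As a rough check of this pattern against the input from the question, the sketch below wraps it in a small helper. NegatedClassDemo and StripTags are hypothetical names chosen for this example, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class NegatedClassDemo
{
    // "<[^>]*>" matches '<', then any run of characters other than '>'
    // (newlines included), then the closing '>'.
    public static string StripTags(string input) =>
        Regex.Replace(input, "<[^>]*>", "");

    public static void Main()
    {
        var text = "FOO <span style=\"mso-bidi-font-size:11.0pt;\nmso-fareast-language:EN-US\"> BAR";
        Console.WriteLine(StripTags(text));              // FOO  BAR
        Console.WriteLine(StripTags("a <b>bold</b> c")); // a bold c
    }
}
```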
