I have a simple string replacement that looks for specific words and replaces them with something else.
For example, with the key bla and the value boo, the code below produces the result shown:

 var inputText = "bla bla test test1 test3...";

 foreach (var obj in dictionary)
 {
     inputText = Regex.Replace(inputText, obj.Key, obj.Value);
 }

 // inputText is now "boo boo test test1 test3..."

Now the input is HTML, so it can be

"bla bla test test1 test3. Go to www.something.com/bla/something", which ends up as

"boo boo test test1 test3. Go to www.something.com/boo/something"

(this content is displayed in an HTML viewer).

I want to skip the replacement inside the URL, so that everything is replaced except the URL. Is that possible?

1 Answer

Yes. Match any substring that looks like a URL and keep it unchanged; otherwise, perform the replacement.

The code will look like this:

inputText = Regex.Replace(inputText, $@"\b(https?://\S+|www\.\S+)|{Regex.Escape(obj.Key)}", m =>
                    m.Groups[1].Success ? m.Groups[1].Value : obj.Value); 

Note that Regex.Escape(obj.Key) escapes any special regex characters in the key, so the key is matched literally.

The \b(https?://\S+|www\.\S+) part matches, starting at a word boundary (\b), either http:// or https:// followed by one or more non-whitespace characters, or www. followed by one or more non-whitespace characters. If the regex matches a URL, the URL is captured in m.Groups[1] and the match evaluator returns it unchanged; otherwise, obj.Value is used as the replacement text.
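To make the fragment above easier to try out, here is a minimal self-contained sketch for a single key/value pair. The class and method names (UrlAwareReplaceDemo, ReplaceOutsideUrls) are made up for illustration; the pattern and the match evaluator are the ones described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlAwareReplaceDemo
{
    // Replaces every occurrence of key with value, except inside URL-like tokens.
    // Hypothetical helper name, not part of any library.
    public static string ReplaceOutsideUrls(string input, string key, string value)
    {
        // Group 1 captures a URL-like token; the second alternative is the escaped key.
        var pattern = $@"\b(https?://\S+|www\.\S+)|{Regex.Escape(key)}";
        return Regex.Replace(input, pattern,
            m => m.Groups[1].Success ? m.Groups[1].Value : value);
    }

    public static void Main()
    {
        var input = "bla bla test test1 test3. Go to www.something.com/bla/something";
        Console.WriteLine(ReplaceOutsideUrls(input, "bla", "boo"));
        // prints: boo boo test test1 test3. Go to www.something.com/bla/something
    }
}
```

Since \S+ runs up to the next whitespace, the whole URL-like token is consumed by group 1, so the key can no longer be matched inside it.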

There can be another problem with this approach, though: because the replacements run one key at a time, text produced by an earlier replacement can be replaced again by a later one, so the same text gets replaced two or more times. To avoid that, build a single regex with an alternation of all the dictionary keys, and use the match evaluator to look up the value for whichever key matched.
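A small sketch of that problem, assuming a made-up pair list where the value of the first pair happens to be the key of the second. An array of tuples is used here instead of a dictionary only to make the processing order explicit; the names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SequentialReplaceDemo
{
    // Applies the pairs one at a time, in order, like the foreach loop in the question.
    public static string ReplaceSequentially(string input, (string Key, string Value)[] pairs)
    {
        foreach (var (key, value) in pairs)
        {
            input = Regex.Replace(input, Regex.Escape(key), value);
        }
        return input;
    }

    public static void Main()
    {
        // Illustrative data: "bla" becomes "boo", and "boo" becomes "foo".
        var pairs = new[] { (Key: "bla", Value: "boo"), (Key: "boo", Value: "foo") };
        Console.WriteLine(ReplaceSequentially("bla boo", pairs));
        // prints: foo foo
    }
}
```

The intended result was probably "boo foo", but the "boo" produced by the first pass is replaced again in the second pass. Matching all keys in one pass avoids this, because replaced text is never rescanned.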

So, I'd recommend something like

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var dct = new Dictionary<string, string>();
dct.Add("bla", "boo");
dct.Add("bla test", "ZZZ");
// Longer keys go first so that "bla test" is tried before "bla"
var pat = $@"\b(https?://\S+|www\.\S+)|(?:{string.Join("|",dct.Keys.Select(k => Regex.Escape(k)).OrderByDescending(x => x.Length))})";
// Console.WriteLine(pat); => \b(https?://\S+|www\.\S+)|(?:bla\ test|bla)
var input ="bla bla test test1 test3. Go to www.something.com/bla/something";
var output = Regex.Replace(input, pat, m => m.Groups[1].Success ? m.Groups[1].Value : dct[m.Value]); 
Console.Write(output);
// => boo ZZZ test1 test3. Go to www.something.com/bla/something
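
The OrderByDescending(x => x.Length) call matters because .NET tries alternatives from left to right and takes the first one that matches. A minimal sketch of the difference, with the two keys from the dictionary above hard-coded and a hypothetical helper name:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationOrderDemo
{
    // Maps each matched key to its replacement, mirroring the two dictionary entries above.
    public static string ReplaceWith(string pattern, string input)
    {
        return Regex.Replace(input, pattern, m => m.Value == "bla test" ? "ZZZ" : "boo");
    }

    public static void Main()
    {
        // Shorter key first: "bla" wins and "bla test" is never matched.
        Console.WriteLine(ReplaceWith(@"bla|bla\ test", "bla test")); // prints: boo test
        // Longer key first: the whole "bla test" is matched.
        Console.WriteLine(ReplaceWith(@"bla\ test|bla", "bla test")); // prints: ZZZ
    }
}
```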
