I have a small regex that replaces non-printable characters (the ones that are not allowed in an XML document) with an empty string.
The incoming data is quite large, so if this Replace takes more than a few milliseconds, I want to cancel it and return the original string.
Below is my code, but I can't seem to hit the catch block even if I provide a TimeSpan of 1 ms. Logs from the stopwatch show that it took well over 10 ms.
What am I doing wrong here?
Does this work only if it doesn't find a match within the given timespan?
What's the best way to test this?
Update: I tested the regex below using a large file (4 MB) that contains no non-printable characters. The regex took 79 ms, yet no exception was thrown.
    // Requires: using System; using System.Diagnostics; using System.Text.RegularExpressions;
    private static string CleanUpNonPrintableCharacters(string incomingString)
    {
        var stopWatch = new Stopwatch();
        try
        {
            stopWatch.Start();
            var timeSpan = TimeSpan.FromMilliseconds(1);
            // Strip control characters that are not allowed in XML 1.0 (tab, LF and CR are kept).
            var cleanedUpString = Regex.Replace(incomingString, @"[\u0000-\u0008\u000B\u000C\u000E-\u001F]", string.Empty, RegexOptions.None, timeSpan);
            stopWatch.Stop();
            Console.WriteLine(stopWatch.ElapsedMilliseconds);
            // Above was 79 ms on a file that doesn't have a match, yet no exception was thrown
            if (cleanedUpString.Length < incomingString.Length)
            {
                // do some logging
            }
            return cleanedUpString;
        }
        catch (RegexMatchTimeoutException ex)
        {
            // do some logging
            return incomingString;
        }
        finally
        {
            // stopWatch.Stop();
            // log elapsed
        }
    }
… Task and issuing a CancellationToken with Timeout, but somehow I feel that this should work out of the box, since Regex.Replace offers the timeout option.
@AkashKC, I've updated my question with the results from my most recent test.
… while(1){} how can it be stopped? Does it have to be killed, probably. So, the regex code is similar, only catching, polling what it wants.
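For reference, the Task-based fallback mentioned in the comments might look roughly like the sketch below. The class and method names (RegexWithDeadline, ReplaceOrOriginal) are hypothetical and not part of the original code. Note the caveat raised in the last comment: giving up on the wait does not stop the regex, which keeps running on a thread-pool thread until it finishes.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class RegexWithDeadline
{
    // Hypothetical helper: returns the cleaned string if the replacement
    // finishes within the deadline, otherwise the untouched input.
    public static string ReplaceOrOriginal(string input, string pattern, string replacement, TimeSpan deadline)
    {
        // Run the replacement on a thread-pool thread so the caller can stop waiting.
        Task<string> work = Task.Run(() => Regex.Replace(input, pattern, replacement));

        // Wait returns false if the deadline passes first. The regex itself is
        // NOT cancelled; it continues in the background until it completes.
        return work.Wait(deadline) ? work.Result : input;
    }
}
```

It could be called as RegexWithDeadline.ReplaceOrOriginal(incomingString, @"[\u0000-\u0008\u000B\u000C\u000E-\u001F]", string.Empty, TimeSpan.FromMilliseconds(5)). The trade-off is that the caller gets a bounded wait, but the CPU work is still spent, so this only helps latency, not throughput.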