5

I have a function that is walking through the string looking for pattern and changing parts of it. I could optimize it by inserting

if (!text.Contains(pattern)) return;

But, I am actually walking through the whole string and comparing parts of it with the pattern, so the question is, how String.Contains() actually works? I know there was such a question - How does String.Contains work? but answer is rather unclear. So, if String.Contains() walks through the whole array of chars as well and compare them to pattern I am looking for as well, it wouldn't really make my function faster, but slower.

So, is it a good idea to attempt such an optimizations? And - is it possible for String.Contains() to be even faster than function that just walk through the whole array and compare every single character with some constant one?

Here is the code:

    public static char colorchar = (char)3;

    public static Client.RichTBox.ContentText color(string text, Client.RichTBox SBAB)
    {
        if (text.Contains(colorchar.ToString()))
        {
            int color = 0;
            bool closed = false;
            int position = 0;
            while (text.Length > position)
            {
                if (text[position] == colorchar)
                {
                    if (closed)
                    {
                        text = text.Substring(position, text.Length - position);
                        Client.RichTBox.ContentText Link = new Client.RichTBox.ContentText(ProtocolIrc.decode_text(text), SBAB, Configuration.CurrentSkin.mrcl[color]);
                        return Link;
                    }

                    if (!closed)
                    {
                        if (!int.TryParse(text[position + 1].ToString() + text[position + 2].ToString(), out color))
                        {
                            if (!int.TryParse(text[position + 1].ToString(), out color))
                            {
                                color = 0;
                            }
                        }
                        if (color > 9)
                        {
                            text = text.Remove(position, 3);
                        }
                        else
                        {
                            text = text.Remove(position, 2);
                        }
                        closed = true;
                        if (color < 16)
                        {
                            text = text.Substring(position);
                            break;
                        }
                    }
                }
                position++;
            }
        }
        return null;
    }
7
  • You'd have to post (an outline of) the rest of the code. But yes, most likely you're doing double work here. Commented Apr 19, 2013 at 9:02
  • 4
    Why don't you time both approaches in a for loop ? Commented Apr 19, 2013 at 9:03
  • 2
    "So, is it a good idea to attempt such an optimizations?" Almost certainly not! This must be one of the most-used functions in the library. It has gotten a lot of attention. Hard to imagine you could improve on it except for very special cases and by really dedicated specialists. Commented Apr 19, 2013 at 9:06
  • 1
    Going through a string, matching patterns, changing strings... maybe you also want to time how a Regex performs. Commented Apr 19, 2013 at 9:10
  • 1
    You also may want to replace your code that finds a position with IndexOf method of string. like position = text.IndexOf(colorchar); and then position = text.IndexOf(colorchar, position); Commented Apr 19, 2013 at 9:13

5 Answers 5

1

Short answer is that your optimization is no optimization at all.
Basically, String.Contains(...) just returns String.IndexOf(..) >= 0
You could improve your alogrithm to:

int position = text.IndexOf(colorchar.ToString()...);
if (-1 < position)
{  /* Do it */ }
Sign up to request clarification or add additional context in comments.

Comments

0

Yes.

And doesn't have a bug (ahhm...).

There are better ways of looking for multiple substrings in very long texts, but for most common usages String.Contains (or IndexOf) is the best.

Also IIRC the source of String.Contains is available in the .Net shared sources

Oh, and if you want a performance comparison you can just measure for your exact use-case

Comments

0

Check this similar post How does string.contains work

I think that you will not be able to simply do anything faster than String.Contains, unless you want to use standard CRT function wcsstr, available in msvcrt.dll, which is not so easy

Comments

0

Unless you have profiled your application and determined that the line with String.Contains is a bottle-neck, you should not do any such premature optimizations. It is way more important to keep your code's intention clear.

Ans while there are many ways to implement the methods in the .NET base classes, you should assume the default implementations are optimal enough for most people's use cases. For example, any (future) implementation of .NET might use the x86-specific instructions for string comparisons. That would then always be faster than what you can do in C#.

If you really want to be sure whether your custom string comparison code is faster than String.Contains, you need to measure them both using many iterations, each with a different string. For example using the Stopwatch class to measure the time.

Comments

0

If you now the details which you can use for optimizations (not just simple contains check) sure you can make your method faster than string.Contains, otherwise - not.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.