0

I have code that is adding ~100,000 items to a List.

If I add an array of strings or objects the code runs almost instantly (under 100 ms), but if I try to add an array of structs, it takes almost 1.5 seconds just for the .Add calls.

Why is there such a performance impact when using a struct[]?

Here is my struct:

public struct LiteRowInfo
{
    public long Position;
    public int Length;
    public int Field;
    public int Row;

    public LiteRowInfo(long position, int length, int field, int row)
    {
        this.Position = position;
        this.Length = length;
        this.Field = field;
        this.Row = row;
    }
}

EDIT 2: Performances of the string method is faster than that of the struct: I appreciate the comments, it does seem like there is additional overhead in creating the struct its self. I think I will just create 2 seperate list to store the position and length to improve performance.

private void Test()
    {
        Stopwatch watch = new Stopwatch();

        watch.Start();
        List<LiteRowInfo[]> structList = new List<LiteRowInfo[]>();

        for (int i = 0; i < 100000; i++)
        {
            LiteRowInfo[] info = new LiteRowInfo[20];

            for (int x = 0; x < 20; x++)
            {
                LiteRowInfo row;
                row.Length = x;
                row.Position = (long)i;
                info[x] = row;
            }
            structList.Add(info);
        }
        Debug.Print(watch.ElapsedMilliseconds.ToString());

        watch.Reset();
        watch.Start();

        List<string[]> stringList = new List<string[]>();

        for (int i = 0; i < 100000; i++)
        {
            string[] info = new string[20];

            for (int x = 0; x < 20; x++)
            {
                info[x] = "String";
            }
            stringList.Add(info);
        }

        Debug.Print(watch.ElapsedMilliseconds.ToString());
    }

EDIT: Here is all relevant code: Note: If I comment out only the pos.Add(rowInfo); line, the performance is similar to that of a string[] or int[].

        private void executeSqlStream()
    {
        List<LiteRowInfo[]> pos = new List<LiteRowInfo[]>();

        long currentPos = 0;

        _stream = new MemoryStream();
        StreamWriter writer = new StreamWriter(_stream);

        using (SqlConnection cnn = new SqlConnection(_cnnString))
        {
            cnn.Open();
            SqlCommand cmd = new SqlCommand(_sqlString, cnn);

            SqlDataReader reader = cmd.ExecuteReader();

            int fieldCount = reader.FieldCount;
            int rowNum = 0;
            UnicodeEncoding encode = new UnicodeEncoding();
            List<string> fields = new List<string>();
            for (int i = 0; i < fieldCount; i++)
            {
                fields.Add(reader.GetFieldType(i).Name);
            }
            while (reader.Read())
            {
                LiteRowInfo[] rowData = new LiteRowInfo[fieldCount];
                for (int i = 0; i < fieldCount; i++)
                {
                    LiteRowInfo info;
                    if (reader[i] != DBNull.Value)
                    {
                        byte[] b;
                        switch (fields[i])
                        {
                            case "Int32":
                                b = BitConverter.GetBytes(reader.GetInt32(i));
                                break;
                            case "Int64":
                                b = BitConverter.GetBytes(reader.GetInt64(i));
                                break;
                            case "DateTime":
                                DateTime dt = reader.GetDateTime(i);
                                b = BitConverter.GetBytes(dt.ToBinary());
                                break;
                            case "Double":
                                b = BitConverter.GetBytes(reader.GetDouble(i));
                                break;
                            case "Boolean":
                                b = BitConverter.GetBytes(reader.GetBoolean(i));
                                break;
                            case "Decimal":
                                b = BitConverter.GetBytes((float)reader.GetDecimal(i));
                                break;
                            default:
                                b = encode.GetBytes(reader.GetString(i));
                                break;
                        }
                        int len = b.Length;

                        info.Position = currentPos += len;
                        info.Length = len;
                        info.Field = i;
                        info.Row = rowNum;
                        currentPos += len;
                        _stream.Write(b, 0, len);
                    }
                    else
                    {
                        info.Position = currentPos;
                        info.Length = 0;
                        info.Field = i;
                        info.Row = rowNum;
                    }
                    rowData[i] = info;
                }
                rowNum++;
                pos.Add(rowData);
            }
        }
    }
6
  • 4
    Your struct has 20 bytes of overhead; any class would only have IntPtr-sized overhead. Commented Nov 17, 2011 at 17:19
  • well a struct with all members as you have shown needs to be created first, or can you show the code where you create and add the structs to the list? Commented Nov 17, 2011 at 17:20
  • Probably because your code requires boxing the structs before they are added to the List. Try sharing the code that adds to the List. Commented Nov 17, 2011 at 17:20
  • @driis: The structs aren't being added to the list - an array reference is being added to the list. Commented Nov 17, 2011 at 17:24
  • 1
    I hope this isn't the code on which you're basing your claim that List<T>.Add is slower for you - you've got huge amounts of other code in use there, including database access. If you have a more targeted benchmark showing List<T>.Add taking longer when T is LiteRowInfo[] than when it's string[], please post that code. Otherwise we'll just have to assume you were actually measuring entirely different code paths, populating the arrays in different ways. Commented Nov 17, 2011 at 17:36

2 Answers 2

8

Given that the array itself is a reference type, I very much doubt that you're actually seeing what you think you're seeing.

I suspect the difference isn't in adding an array reference to a list - I suspect it's creating the array in the first place. Each array element will take more space than a reference, so you're having to allocate more memory. That may well mean you're also triggering garbage collection.

To benchmark just List<T>.Add, I suggest that you repeatedly add a reference to the same array several times.

As an aside, having an array as the list element type feels like a bit of a smell to me. There are times when that's valid, but personally I would consider whether it's actually something which could be encapsulated in another type.

EDIT: You say you've posted all the relevant code, but that really isn't benchmark code for List<T>.Add - it contains database access for one thing, which is almost certainly taking way longer than any of the in-memory manipulation!

Sign up to request clarification or add additional context in comments.

3 Comments

+1: The reason commenting out pos.Add(rowInfo) makes things go faster is that the compiler is able to recognize the rowInfo is not used so it optimizes away the creation of the struct array.
Thanks Jon, I wasn't seeing what I think I was seeing :).
@StriplingWarrior - good explanation for the large performance gain, thanks.
0

There could be some boxing happening in the code which is not related to the List<> since generic Lists handle value types without boxing. Unless sharing the code, cannot help.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.