
I have some large arrays of 2D data elements. The A and B dimensions aren't equally sized:

A is between 5 and 20

B is between 1000 and 100000

Initialization time is no problem, as these are only going to be lookup tables for a realtime application, so the performance of indexing elements from known values of A and B is crucial. The data stored is currently a single byte value.

I was thinking around these solutions:

byte[A][B] datalist1a;

or

byte[B][A] datalist2a;

or

byte[A,B] datalist1b;

or

byte[B,A] datalist2b;

or perhaps losing the multidimensionality, since I know the fixed sizes, and just multiplying the two values before looking it up:

byte[A*Bmax + B] datalist3;

or

byte[B*Amax + A] datalist4;

What I need to know is what datatype/array structure to use for the most efficient lookup in C# with this setup.

Edit 1: The first two solutions were supposed to be multidimensional arrays, not arrays of arrays.

Edit 2: All the data in the smallest dimension is read at each lookup, but the large dimension is only used for indexing once at a time.

So it's something like - grab all the A's from sample B.
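
For illustration, a minimal sketch of what the flattened datalist4 variant could look like. The constant names and the Lookup helper are assumptions for the example, not anything fixed above:

const int Amax = 20;        // assumed upper bound of the A dimension
const int Bmax = 100000;    // assumed upper bound of the B dimension

byte[] datalist4 = new byte[Bmax * Amax];

// All A values of one B sample are contiguous in memory, which matches
// the "grab all A's from sample B" access pattern from Edit 2.
byte Lookup(int a, int b)
{
    return datalist4[b * Amax + a];
}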

Comments

  • Probably depends how you're stepping through the array when looking up values.
  • Have you also considered a two-dimensional array: byte[A,B]? (The form byte[A][B] is an array of arrays.)
  • The first three rules of performance optimization: measure, measure, measure. Set up some kind of test environment that matches the production environment, and test them.
  • There is no single correct answer to this question in its current form. As @George mentions, it depends on how you access the array. Are you doing random access? Sequential? If sequential, in which direction (i.e. the A dimension or the B dimension) primarily? The best way to handle this is to write different versions of the code and profile them, and that is in fact the correct answer to any such "what is fastest" question anyway.
  • @Richard Using multidimensional arrays in C# has a significant impact on performance; they are implemented differently: codeproject.com/KB/dotnet/arrays.aspx

5 Answers


I'd bet on the jagged arrays, unless Amax or Bmax is a power of 2.

I'd say so because a jagged array needs only two indexed accesses, which is very fast. The other forms imply a multiplication, either implicit or explicit; unless that multiplication reduces to a simple shift, I think it could be a bit heavier than a couple of indexed accesses.

EDIT: Here is the small program used for the test:

using System;
using System.Diagnostics;

// Compile with the /unsafe switch; the last test uses pointer access.
class Program
{
    private static int A = 10;
    private static int B = 100;

    private static byte[] _linear;
    private static byte[,] _square;
    private static byte[][] _jagged;



    unsafe static void Main(string[] args)
    {
        //init arrays
        _linear = new byte[A * B];
        _square = new byte[A, B];
        _jagged = new byte[A][];
        for (int i = 0; i < A; i++)
            _jagged[i] = new byte[B];

        //set-up the params
        var sw = new Stopwatch();
        byte b;
        const int N = 100000;

        //one-dim array (buffer)
        sw.Restart();
        for (int i = 0; i < N; i++)
        {
            for (int r = 0; r < A; r++)
            {
                for (int c = 0; c < B; c++)
                {
                    b = _linear[r * B + c];
                }
            }
        }
        sw.Stop();
        Console.WriteLine("linear={0}", sw.ElapsedMilliseconds);

        //two-dim array
        sw.Restart();
        for (int i = 0; i < N; i++)
        {
            for (int r = 0; r < A; r++)
            {
                for (int c = 0; c < B; c++)
                {
                    b = _square[r, c];
                }
            }
        }
        sw.Stop();
        Console.WriteLine("square={0}", sw.ElapsedMilliseconds);

        //jagged array
        sw.Restart();
        for (int i = 0; i < N; i++)
        {
            for (int r = 0; r < A; r++)
            {
                for (int c = 0; c < B; c++)
                {
                    b = _jagged[r][c];
                }
            }
        }
        sw.Stop();
        Console.WriteLine("jagged={0}", sw.ElapsedMilliseconds);

        //one-dim array within unsafe access (and context)
        sw.Restart();
        for (int i = 0; i < N; i++)
        {
            for (int r = 0; r < A; r++)
            {
                fixed (byte* offset = &_linear[r * B])
                {
                    for (int c = 0; c < B; c++)
                    {
                        b = *(byte*)(offset + c);
                    }
                }
            }
        }
        sw.Stop();
        Console.WriteLine("unsafe={0}", sw.ElapsedMilliseconds);

        Console.Write("Press any key...");
        Console.ReadKey();
        Console.WriteLine();
    }
}

14 Comments

Amax or Bmax is just the length of the two dimensions, and they will not be a power of 2 very often (only if we are lucky from time to time). So okay, jagged it is.
@BerggreenDK remember to cache the current line in the first loop to reduce indexer operations, if iterating in order. E.g. for(...) { var line = arr[i]; for (...) { var elem = line[j]; } }
Although jagged arrays are good for maintainable data structures, an element access isn't as simple as a single indexer operation - remember the CLR has to do null checking, bounds checking, etc. This adds a significant amount of overhead to each indexer access. Algebraic arrays would be faster because of this (but less maintainable).
FYI, I've added the source of the program used for the test.
@MarioVernari: Oh no, it's not the code you posted which is wrong. I was referring to your comment (no. 5 from the top) "Anyway, it does not matter which array is the first: it's an access that costs in equal manner.", which is obviously wrong, because it does matter which array is first :)
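
To make the row-caching suggestion above concrete, here is a sketch against the fields of the test program (illustrative, not part of the original benchmark):

// Hoist the row lookup out of the inner loop: one outer indexer
// access (with its null and bounds checks) per row instead of per element.
for (int r = 0; r < A; r++)
{
    byte[] row = _jagged[r];
    for (int c = 0; c < B; c++)
    {
        byte b = row[c];
    }
}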
  • Multidimensional ([,]) arrays are nearly always the slowest, except under heavy random-access scenarios. In theory they shouldn't be, but it's one of the CLR's oddities.
  • Jagged arrays ([][]) are nearly always faster than multidimensional arrays, even under random-access scenarios. They carry a memory overhead.
  • Single-dimensional ([]) and algebraic arrays ([y * stride + x]) are the fastest for random access in safe code.
  • Unsafe code is normally the fastest in all cases (provided you don't pin the array repeatedly).

4 Comments

"Unsafe code is, obviously, the fastest in all cases". I don't think so. Try yourself, and you'll be surprised (C# speaking).
You skip bounds checking; which is the slowest part of arrays.
That's right, but the unsafe way requires the array to be pinned with the "fixed" statement, which is slow. Also, well, it depends on how you want to have access to the array: bitmap access (so fixed once), or random access (fixed every time).
@Mario - Fixed doesn't actually pin the memory, it uses a clever trick to avoid it. In spite of the documentation. No bounds checking either, it is fast and dangerous with perf like C arrays.
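
As a rough sketch of the "fixed once" idea discussed above, pinning the linear buffer a single time for the whole scan (unsafe context required; reuses the _linear buffer from the benchmark):

// One fixed statement for the entire scan, instead of re-entering
// "fixed" on every row as the test program does.
fixed (byte* basePtr = _linear)
{
    for (int r = 0; r < A; r++)
    {
        byte* row = basePtr + r * B;
        for (int c = 0; c < B; c++)
        {
            byte b = *(row + c);
        }
    }
}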

The only useful answer to "which X is faster" (for all X) is: you have to do performance tests that reflect your requirements.

And remember to consider, in general*:

  • Maintenance of the program. If this is not a quick one-off, a slightly slower but maintainable program is the better option in most cases.
  • Micro-benchmarks can be deceptive. For instance, a tight loop that just reads from a collection might be optimised away in ways that aren't possible when real work is being done (see the sketch below).

Additionally, consider that you need to look at the complete program to decide where to optimise. Speeding up a loop by 1% might be useful for that loop, but if it is only 1% of the complete runtime then it is not making much difference.


* But all rules have exceptions.
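
One cheap guard against that dead-code pitfall is to consume what you read, so the JIT cannot discard the loop. A sketch, reusing the jagged array from the first answer:

// An unused read like "b = arr[r][c]" is a candidate for elimination;
// an accumulated and printed one is not.
long checksum = 0;
for (int r = 0; r < A; r++)
    for (int c = 0; c < B; c++)
        checksum += _jagged[r][c];
Console.WriteLine(checksum);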

1 Comment

This is good general advice, but it doesn't answer the question, which is quite specific about optimizing the performance of access to 2-dimensional chunks of data in C# - and which has quite a specific answer.

On most modern computers, arithmetic operations are far, far faster than memory lookups. If you fetch a memory address that isn't in cache, or the out-of-order execution pulls from the wrong place, you are looking at 10-100 clocks, while a pipelined multiply takes about 1 clock. The other issue is cache locality. byte[B*Amax + A] datalist4; seems like the best bet if you are accessing with the A's varying sequentially. When datalist4[b*Amax + a] is accessed, the computer will usually start pulling in datalist4[b*Amax + a + 64/sizeof(dataListType)], ... +128 ..., etc., or, if it detects a reverse iteration, datalist4[b*Amax + a - 64/sizeof(dataListType)].
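
In loop form, that cache-friendly access pattern looks like the sketch below; Bcount is a hypothetical name for the number of B samples:

// A varies fastest, so consecutive iterations touch consecutive bytes
// and the hardware prefetcher can stream whole cache lines in.
for (int b = 0; b < Bcount; b++)
{
    int rowStart = b * Amax;
    for (int a = 0; a < Amax; a++)
    {
        byte value = datalist4[rowStart + a];
    }
}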

Hope that helps!

Comments


Maybe the best way for you would be to use a HashMap - a Dictionary in C#?
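
For what it's worth, that would look something like the sketch below, keyed on the (A, B) pair; each lookup then costs a hash plus an equality check rather than a plain index:

using System.Collections.Generic;

// Purely illustrative: a dictionary keyed on the (a, b) value pair.
var table = new Dictionary<(int a, int b), byte>();
table[(3, 42)] = 0x7F;
byte value = table[(3, 42)];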

1 Comment

Too complex a data structure, considering our need for simple byte arrays and an index.
