I am sorting millions of structs organized in an array with the qsort function of the standard C library. I tried to improve performance by creating an array of pointers to the structs, with the same length, and sorting that instead. Contrary to my expectations, the second variant is slower:

qsort on an array of structs: 199 s
qsort on an array of pointers to structs: 204 s

I expected swapping pointers in memory to be faster than moving whole structs (size 576 bytes each). Is there a performance problem in my approach, or is this known behaviour?

  • You have to measure it, via a call to time(3) before and after the sort function is called. Commented Jul 25, 2016 at 19:24
  • Is it possible that sorting an array of structs with qsort already swaps the pointers and not the structs? Commented Jul 25, 2016 at 19:28
  • Also, 5 seconds is a 2.5% difference, which may be within your margin of error. Commented Jul 25, 2016 at 19:31
  • No, qsort will move the structs (if that's what you told it to do). You need to show the code. In particular, if the time spent in the comparison function is large compared to the time to move a structure, then the pointer array won't help anything. Commented Jul 25, 2016 at 19:31
  • Post the code. Without code (and showing us what exactly you measured) it is pointless. Commented Jul 25, 2016 at 20:33

4 Answers

6

There are other issues here.

By creating the array of pointers you are fragmenting the memory. Algorithms in the standard libraries are designed to optimise the sorting of contiguous arrays, so by doing this you are probably missing the cache far more often than if you just had a bigger array.

Quicksort in particular is quite good for locality of reference, as you halve the sample size and so eventually you are sorting subsets of the original array in chunks that can completely fit into your cache.

As a general rule, a cache miss is about an order of magnitude slower than a hit. That delay could be large enough to cancel out the speed-up you get from not copying all the bytes.

2

The way quicksort works, it gradually re-organizes the array by placing neighboring elements closer together. This allows the data cache to work more efficiently the closer the algorithm gets towards the final result.

If you convert to an array of pointers, then the data accesses will likely slow down, since the structures maintain their "unsorted" ordering, while their pointers are getting sorted. But, comparing the structures requires following the pointers to their "unsorted" instances, which might cause data cache misses.

To achieve something like what you desire, you can create an indexing structure to your data. The indexing structure would hold the sorting key (or a copy of it).

struct index_type {
    key_type key;
    data_type *data;
};

And now, you would sort an array of index_type instead of an array of pointers to data_type. Since the key is stored in the array itself, you avoid the issue of following pointers to your "unsorted" structures.
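
A minimal sketch of how such an index could be built and sorted. The concrete definitions of key_type and data_type, and the names compare_index and build_sorted_index, are assumptions made up for illustration; only struct index_type comes from the text above.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the key_type and data_type named above. */
typedef int key_type;

typedef struct {
    key_type value;     /* the sort key */
    char payload[572];  /* bulk data that should not move during the sort */
} data_type;

struct index_type {
    key_type key;       /* copy of the key, stored inline in the index */
    data_type *data;    /* points back to the full record */
};

/* Compare two index entries by their inline key; no pointer is followed. */
static int compare_index(const void *a, const void *b)
{
    const struct index_type *ia = a;
    const struct index_type *ib = b;
    return (ia->key > ib->key) - (ia->key < ib->key);
}

/* Build an index over records[0..count-1] and sort it by key.
   Returns a malloc'd array the caller must free, or NULL on failure. */
struct index_type *build_sorted_index(data_type *records, size_t count)
{
    struct index_type *index = malloc(count * sizeof *index);
    if (index == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        index[i].key = records[i].value;
        index[i].data = &records[i];
    }
    qsort(index, count, sizeof *index, compare_index);
    return index;
}
```

After the call, index[0].data points to the record with the smallest key, and the records themselves stay where they were. Whether this beats sorting the structs directly still has to be measured on the target machine.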

0

I did a quick sanity check using this structure (which has size 576 when int is 32-bit)

struct test
{
    int value;
    char data[572];
};

I initialized a dynamically allocated array of 1 million structs with this code

for ( int i = 0; i < count; i++ )
{
    array[i].value = rand();
    for ( int j = 0; j < 572; j++ )
        array[i].data[j] = rand();
}

And I sorted the array with this code

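int compare( const void *ptr1, const void *ptr2 )
{
    const struct test *tptr1 = ptr1;
    const struct test *tptr2 = ptr2;
    /* subtraction would be safe for rand() values, but can overflow for arbitrary ints */
    return (tptr1->value > tptr2->value) - (tptr1->value < tptr2->value);
}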

int main( void )
{
    int count = 1000000;
    ...
    qsort( array, count, sizeof(struct test), compare );
    ...
}

The time to initialize the array was 4.3 seconds, and the time to sort the array was 0.9 seconds.

I then modified the code to create an array of pointers to the structures, and sorted the pointer array. The initialization time was still 4.3 seconds (most of the initialization time is due to calling rand() 500 million times). Sorting the pointer array took 0.4 seconds. Sorting the pointer array was more than twice as fast as sorting the structure array directly.
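
The pointer-array variant is not shown above, so here is a sketch of what it might look like. The names compare_ptr and sort_by_pointer are made up for illustration and this is not necessarily the exact code that was timed. The point to note is the extra dereference in the comparison function, since qsort passes pointers to the array elements, which are themselves pointers.

```c
#include <stdlib.h>

struct test
{
    int value;
    char data[572];
};

/* qsort hands the comparator pointers to array elements; the elements
   here are `struct test *`, so one extra dereference is needed. */
static int compare_ptr(const void *ptr1, const void *ptr2)
{
    const struct test *t1 = *(const struct test *const *)ptr1;
    const struct test *t2 = *(const struct test *const *)ptr2;
    return (t1->value > t2->value) - (t1->value < t2->value);
}

/* Returns a malloc'd array of pointers into array[], sorted by value.
   The structs themselves are not moved. Returns NULL if malloc fails. */
struct test **sort_by_pointer(struct test *array, size_t count)
{
    struct test **ptrs = malloc(count * sizeof *ptrs);
    if (ptrs == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        ptrs[i] = &array[i];
    qsort(ptrs, count, sizeof *ptrs, compare_ptr);
    return ptrs;
}
```

A common mistake in this variant is to cast ptr1 directly to struct test *, which compares the wrong memory and can make the sort both incorrect and slow.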

So my conclusion is that your code has some massive inefficiencies that have nothing to do with qsort.

0

Which is faster will depend, in general, on the size of the structure. For structures that are the same size as a pointer, it should be obvious that sorting the structures will be faster than sorting pointers to the structures, since the comparison avoids an extra indirection. As the structure size increases, a point will be reached where the reverse is true (imagine sorting an array of 1 MB structures: you'd spend most of your time in memcpy()). Where, exactly, that point lies will depend on things outside the control of the code (cache structure, cache size, etc.). If this is important to you, then you'd best experiment and measure.
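
One way to run that experiment is sketched below, under some assumptions: the key is an int stored at the start of each element, the elements are otherwise zero-filled, and timing uses clock() from <time.h>. The names key_of, cmp_elem, cmp_ptr and time_sort are hypothetical helpers, not part of any library.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Read the int key stored at the start of an element. */
static int key_of(const void *elem)
{
    int k;
    memcpy(&k, elem, sizeof k);
    return k;
}

/* Comparator for sorting the elements themselves. */
static int cmp_elem(const void *a, const void *b)
{
    int ka = key_of(a), kb = key_of(b);
    return (ka > kb) - (ka < kb);
}

/* Comparator for sorting pointers to the elements. */
static int cmp_ptr(const void *a, const void *b)
{
    int ka = key_of(*(void *const *)a), kb = key_of(*(void *const *)b);
    return (ka > kb) - (ka < kb);
}

/* Create count (> 0) elements of elem_size bytes (>= sizeof(int)) with
   random keys, sort them in place or through a pointer array, and
   return the CPU seconds spent in qsort, or -1.0 if allocation fails. */
double time_sort(size_t elem_size, size_t count, int via_pointers)
{
    char *buf = calloc(count, elem_size);
    void **ptrs = malloc(count * sizeof *ptrs);
    if (buf == NULL || ptrs == NULL) {
        free(buf);
        free(ptrs);
        return -1.0;
    }
    for (size_t i = 0; i < count; i++) {
        int k = rand();
        memcpy(buf + i * elem_size, &k, sizeof k);
        ptrs[i] = buf + i * elem_size;
    }
    clock_t start = clock();
    if (via_pointers)
        qsort(ptrs, count, sizeof *ptrs, cmp_ptr);
    else
        qsort(buf, count, elem_size, cmp_elem);
    clock_t end = clock();
    free(ptrs);
    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_sort(size, 1000000, 0) and time_sort(size, 1000000, 1) for a range of sizes (for example sizeof(void *), 64, 576 and 4096) should show roughly where the crossover lies on a given machine. Each call generates fresh random keys, so repeat the runs and compare averages rather than single numbers.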
