This is sort of a silly question, but it's been bothering me and I couldn't google-fu my way over it.

Consider the following array:

struct SomeDataStruct
{
    uint64_t ValueOne;
    uint64_t ValueTwo;
    uint64_t ValueThree;
};

SomeDataStruct _veryLargeArray[1024];

Now, which of these approaches is faster for looping over every element and doing something with each one?

Approach 1:

for (int i = 0; i < 1024; ++i)
{
    _veryLargeArray[i].ValueOne += 1;
    _veryLargeArray[i].ValueTwo += 1;
    _veryLargeArray[i].ValueThree = _veryLargeArray[i].ValueOne + _veryLargeArray[i].ValueTwo;
}

Approach 2:

// One past the last element; used only for comparison, never dereferenced.
SomeDataStruct * pEndOfStruct = _veryLargeArray + 1024;

for (SomeDataStruct * ptr = _veryLargeArray; ptr != pEndOfStruct; ptr += 1)
{
    ptr->ValueOne += 1;
    ptr->ValueTwo += 1;
    ptr->ValueThree = ptr->ValueOne + ptr->ValueTwo;
}

I know the question seems really stupid on its surface, but what I'm wondering is: does the compiler do anything smart or special with either way of implementing the for loop? In the first case, it could be really memory intensive if the compiler actually looked up BaseArrayPointer + Offset every time, but if the compiler is smart enough it will fill the L2 cache with the entire array and treat the code between the { }'s correctly.

The second way avoids the problem of the compiler resolving the pointer every time, but probably makes it really hard for the compiler to figure out that it could copy the entire array into the L2 cache and walk it there.

Sorry for such a silly question. I'm having a lot of fun learning C++ and have started doing that thing where you overthink everything. Just curious whether anyone knows if there is a "definitive" answer.

  • What platform are you asking about? On the major desktop platforms, it isn't the compiler's job to "copy the array to L2"; there's MMU hardware for that. Commented Jul 26, 2019 at 3:32
  • In some cases, the compiler may even generate the same code for both loops. You shouldn't notice any performance difference. Furthermore, for readability, I recommend you use a range for-loop over either of these options. Commented Jul 26, 2019 at 3:35
  • Each compiler has its own set of optimizations. The only way to get a definitive answer for your compiler on your architecture is for you to profile the code. (Well, if the loops happen to compile to the same machine code, I guess you don't have to actually run it to know they have the same performance.) Commented Jul 26, 2019 at 3:46
  • In this case, using a structure of arrays will make it easier to vectorize the code. Commented Jul 26, 2019 at 9:28
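As a rough illustration of two suggestions from the comments, here is a sketch of the same update written as a range for-loop, followed by a structure-of-arrays variant. The names `updateAll` and `SomeDataArrays` are made up for this example, and `std::array` is swapped in for the raw array from the question. Whether the second layout actually vectorizes better depends on the compiler and target, so treat it as something to measure rather than a guarantee.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct SomeDataStruct
{
    std::uint64_t ValueOne;
    std::uint64_t ValueTwo;
    std::uint64_t ValueThree;
};

// Range for-loop over the array-of-structures layout from the question.
// Iterating by reference modifies the elements in place instead of copies.
void updateAll(std::array<SomeDataStruct, 1024>& items)
{
    for (SomeDataStruct& item : items)
    {
        item.ValueOne += 1;
        item.ValueTwo += 1;
        item.ValueThree = item.ValueOne + item.ValueTwo;
    }
}

// Hypothetical structure-of-arrays layout (not in the original question):
// each field is stored contiguously in its own array.
struct SomeDataArrays
{
    std::array<std::uint64_t, 1024> ValueOne{};
    std::array<std::uint64_t, 1024> ValueTwo{};
    std::array<std::uint64_t, 1024> ValueThree{};
};

// Same update as above, applied field by field at each index.
void updateAll(SomeDataArrays& data)
{
    for (std::size_t i = 0; i < data.ValueOne.size(); ++i)
    {
        data.ValueOne[i] += 1;
        data.ValueTwo[i] += 1;
        data.ValueThree[i] = data.ValueOne[i] + data.ValueTwo[i];
    }
}
```

Both overloads compute the same values; only the memory layout differs.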

1 Answer

Unless you want to look at the intermediate assembly language output and analyze the caching behaviour of the CPU, the only way you'll be able to answer this question is to profile the code. Run it hundreds or thousands of times and see how long it takes.
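A minimal sketch of that kind of measurement, assuming `std::chrono::steady_clock` is precise enough on your platform. The function names `updateIndexed`, `updatePointer`, and `timeNanoseconds` are invented for this example, and a real benchmark would also need warm-up runs and several repeated measurements.

```cpp
#include <chrono>
#include <cstdint>

struct SomeDataStruct
{
    std::uint64_t ValueOne;
    std::uint64_t ValueTwo;
    std::uint64_t ValueThree;
};

constexpr int kCount = 1024;

// Approach 1 from the question: index-based loop.
void updateIndexed(SomeDataStruct* data)
{
    for (int i = 0; i < kCount; ++i)
    {
        data[i].ValueOne += 1;
        data[i].ValueTwo += 1;
        data[i].ValueThree = data[i].ValueOne + data[i].ValueTwo;
    }
}

// Approach 2 from the question: pointer-based loop.
void updatePointer(SomeDataStruct* data)
{
    SomeDataStruct* end = data + kCount; // one past the last element
    for (SomeDataStruct* ptr = data; ptr != end; ++ptr)
    {
        ptr->ValueOne += 1;
        ptr->ValueTwo += 1;
        ptr->ValueThree = ptr->ValueOne + ptr->ValueTwo;
    }
}

// Calls `update(data)` `repetitions` times and returns the total
// elapsed wall-clock time in nanoseconds.
template <typename Update>
long long timeNanoseconds(Update update, SomeDataStruct* data, int repetitions)
{
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repetitions; ++r)
    {
        update(data);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

You would call it as, for example, `timeNanoseconds(updateIndexed, array, 100000)` and `timeNanoseconds(updatePointer, array, 100000)`, then compare the two totals. Build with optimizations enabled, and print or otherwise use some element of the array afterwards so the compiler cannot discard the work.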

If you want the fastest code, write the simplest, most obvious version and leave it to the optimizing compiler. If you try to get fancy with a loop like this, you risk confusing the compiler so that it can't optimize the loop as well.

I've seen a simple C loop compile to be faster than hand-coded assembly, and a hand-optimized C version that ended up slower than the hand-coded assembly.

On the other hand it can pay to know a bit about caching and what is going on under the hood. But usually, that happens after you've discovered that your code isn't fast enough. Doing otherwise risks premature optimization, which is the root of all evil.

1 Comment

I should probably throw in something about the law of leaky abstractions: joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions since it slightly contradicts the answer above. Sometimes you should know what's going on under the hood, avoid an n-squared algorithm when you could do it in O(n), etc.
