
Let's say that we have a long array named DataList, and two other arrays that contain indexes into it: one contains the indexes in ascending order (i.e. 0, 1, 2, 3, 4, ...), named sIndexes; the other contains the same indexes randomly shuffled (i.e. 6, 5, 1, 9, 7, ...), named rIndexes.
These index arrays (sIndexes and rIndexes) are used to index the DataList array. When we use sIndexes to index DataList, it accesses the elements sequentially. When we use rIndexes, it accesses random places in the array.

So my question is,
Is there any performance difference between random indexing and sequential indexing? Won't cache misses contribute to a performance loss (e.g. when an index points to a location that is not in any currently cached line)?
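
For concreteness, a minimal sketch of the setup (the int element type, the array size, and the shuffle seed are just placeholders picked for illustration):

```cpp
#include <algorithm>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

int main() {
    // The long data array (element type assumed to be int here).
    std::vector<int> DataList(1'000'000, 1);

    // sIndexes: 0, 1, 2, 3, 4, ... in ascending order.
    std::vector<std::size_t> sIndexes(DataList.size());
    std::iota(sIndexes.begin(), sIndexes.end(), std::size_t{0});

    // rIndexes: the same index values, randomly shuffled.
    std::vector<std::size_t> rIndexes = sIndexes;
    std::shuffle(rIndexes.begin(), rIndexes.end(), std::mt19937{12345});

    // Both loops read every element exactly once; only the access order differs.
    long long seq = 0, rnd = 0;
    for (std::size_t i : sIndexes) seq += DataList[i]; // sequential access
    for (std::size_t i : rIndexes) rnd += DataList[i]; // random access
    std::printf("%lld %lld\n", seq, rnd); // print so the loops aren't optimized away
}
```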

  • "Isn't cache misses gonna contribute in performance loss" - cache misses will slow things down, so theoretically, if one proves that cache misses will definitely occur, "random indexing" should be slower. However, does it really matter that much? You'll probably get like 2 microsecond performance penalty, if any. Commented Jan 16, 2021 at 13:25
  • Well, yes, if the data is not in the cache it will take more time to load it, so sequential ordering should be more efficient. But that also depends on the size of the elements in DataList: if they are around the size of a cache line (64 bytes) or larger, or alternatively if the indexes are sparse enough that the accesses never share a line, then there shouldn't be much of a difference. Then again, you should measure it to see the actual performance difference. Commented Jan 16, 2021 at 13:29

2 Answers


Linear access is much faster, because:

  • The minimum size to fetch from RAM is one cache-line (usually 64 bytes). If you only use one byte from that, you are wasting precious memory bandwidth.
  • Modern CPUs can detect regular access patterns and will pre-fetch data from RAM, so it will be cached before you even access it. (Or at least it will already be streaming at maximum memory bandwidth.)
  • If linear access is detected at compile-time (probably not the case for your code), it can be vectorized into SIMD instructions, processing many items at once.

Random access will be much slower, unless your whole array fits into the L2 cache or you do a lot of work per item. An L2 cache miss is usually quite expensive.
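
A rough way to see this yourself is a micro-benchmark like the sketch below (my illustration, not rigorous; the 64 MiB array size is simply chosen to exceed typical L2/L3 caches):

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

int main() {
    // 64 MiB of ints: much larger than typical caches, so random access misses.
    std::vector<int> data(16 * 1024 * 1024);
    std::iota(data.begin(), data.end(), 0);

    std::vector<std::size_t> order(data.size());
    std::iota(order.begin(), order.end(), std::size_t{0});

    // Sum the array in the order given by 'order' and report the time taken.
    auto run = [&](const char* label) {
        auto t0 = std::chrono::steady_clock::now();
        long long sum = 0;
        for (std::size_t i : order) sum += data[i];
        auto t1 = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        std::printf("%s: %.1f ms (sum=%lld)\n", label, ms, sum);
    };

    run("sequential");                                   // linear walk
    std::shuffle(order.begin(), order.end(), std::mt19937{42});
    run("shuffled");                                     // random walk
}
```

Expect the shuffled pass to be noticeably slower on a typical desktop CPU, but measure on your own hardware, as the comments above suggest.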

For details I recommend the classic What Every Programmer Should Know About Memory[1], a long and fascinating read. It's from 2007, but CPU architecture has not fundamentally changed; if anything, it has become more relevant.

The above is true for modern, large CPUs. Small embedded CPUs exist that have no cache but predictable SRAM access; on those, random indexing performs the same as sequential indexing.

[1] https://www.gwern.net/docs/cs/2007-drepper.pdf




What happens "under the hood" is likely a multiplication (index * sizeof(data)) and that takes the same amount of time for the types involved no matter what the values are.

As you yourself point out, actually fetching the value from the array may take longer if the part of the array being accessed isn't already in the cache.

For a single-threaded application, one usually wants to pack the data tightly to promote cache hits (see the idea behind std::hardware_constructive_interference_size). For multithreaded applications, one often tries to keep the elements apart (std::hardware_destructive_interference_size) to avoid false sharing, hopefully letting each thread have its cache line undisturbed by the others.
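
As an illustration of the multithreaded case, here is a minimal sketch of padding per-thread data to separate cache lines (my example, assuming C++17; note that some standard libraries only define the interference-size constants in fairly recent versions):

```cpp
#include <new>      // std::hardware_destructive_interference_size (C++17)
#include <thread>
#include <vector>

// Each counter is aligned (and thus padded) to its own cache line, so
// concurrently writing threads do not false-share a line. The struct and
// its 'value' member are hypothetical names for this sketch.
struct alignas(std::hardware_destructive_interference_size) PaddedCounter {
    long long value = 0;
};

int main() {
    constexpr int nThreads = 4;
    std::vector<PaddedCounter> counters(nThreads);

    std::vector<std::thread> workers;
    for (int t = 0; t < nThreads; ++t)
        workers.emplace_back([&counters, t] {
            for (int i = 0; i < 10'000'000; ++i)
                ++counters[t].value; // each thread writes only its own line
        });
    for (auto& w : workers) w.join();
}
```

Without the alignas padding, the four counters would likely share one or two cache lines, and the threads would invalidate each other's caches on every write.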

Comments

Isn't this going to affect CPU caching? Like if the index is pointing to a location of the array that lies outside the current cache line?
@D-RAJ That's a fair question, and yes, that could actually have an effect. I'm removing this answer. You seem to have this covered. :-)
I probably should add it to the question :3
@D-RAJ :-) That made it into a more interesting question indeed. Updated the answer.
CPUs have dedicated hardware for array indexing; this multiplication will most likely happen "for free" in parallel with everything else.