
I'm trying to reproduce the first piece of code in this article:

http://www.drdobbs.com/parallel/cache-friendly-code-solving-manycores-ne/240012736

Namely:

static volatile int array[Size];
static void test_function(void)
{
    for (int i = 0; i < Iterations; i++)
        for (int x = 0; x < Size; x++)
          array[x]++;
}

I'm running on OS X with an Ivy Bridge processor, and therefore have 64KiB of L1 cache. However, no matter how I change the array size, the program takes the same amount of time. Here's my code:

#define ARRAY_SIZE (16 * 1024)
#define NUM_ITERATIONS 200000

volatile int array[ARRAY_SIZE];

int main(int argc, const char * argv[])
{
    for (int i = 0; i < NUM_ITERATIONS; i++)
        for (int x = 0; x < ARRAY_SIZE; x++)
            array[x]++;
    return 0;
}

Now, according to the logic suggested by the article, array should be 64KiB and use all of my L1 cache. However, I've tried this with many different combinations of ARRAY_SIZE (up to 160 * 1024), setting NUM_ITERATIONS accordingly, and every combination takes about the same amount of time.
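For reference, the working-set arithmetic behind those sizes can be sketched as below. The helper name working_set_kib is hypothetical, and the figures assume a 4-byte int, which is the usual case on x86-64 OS X but is not guaranteed by the C standard.

```c
#include <stddef.h>

/* Hypothetical helper: size in KiB of an int array with `elements` entries.
   Assumes nothing about sizeof(int); it simply asks the compiler. */
size_t working_set_kib(size_t elements)
{
    return elements * sizeof(int) / 1024;
}
```

With a 4-byte int, 16 * 1024 elements come to 64 KiB and 160 * 1024 elements come to 640 KiB, so the largest size tried is well beyond 64 KiB.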

I'm using gcc -o cachetest cachetest.c to compile, with no other options. Is there some kind of optimization going on that I don't know about, even though volatile is used? Or are there so many parallel processes and context switching that I can't even tell? What's going on here? I'm so confused.

Thanks SO!

  • What if... it was running in optimal time to begin with!? tin foil hat Commented Nov 24, 2013 at 10:37
  • A quick look at the assembler would tell you whether the compiler had, for instance, optimised this code away to nothing. Commented Nov 24, 2013 at 10:55
  • Probably the processor preloads the data? We are accessing it perfectly sequentially. Don't take my word on this, though. Commented Dec 21, 2013 at 3:51

1 Answer

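There are two things to consider:

  • The compiler may apply some default optimizations to your code.
  • Your code does not use the array anywhere else; it only increments the array values inside the loop. The compiler may therefore optimize further and reduce your program to one that does nothing (just return 0), which would still be correct. Note that volatile is meant to prevent exactly this kind of elimination, so check the generated code before assuming it happened.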

I recommend that you:

  • Add more code inside the loop so that the compiler cannot eliminate it. For example, printf the array value, or add the array value to a sum variable and print the sum after the loop.
  • Turn off all compiler optimization by compiling with the -O0 option (this is already GCC's default when no -O option is given).
  • Inspect the assembly generated by the compiler by using the -S option.
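The first recommendation could look something like the sketch below. It is only an illustration under stated assumptions: the names MAX_INTS, run_increments and print_sweep, the 16 MiB upper bound and the fixed work budget are all hypothetical choices, not taken from the article or the question. It times only the increment loops with clock(), returns a checksum so the result is actually used, and keeps the total number of increments constant while the working set doubles.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define MAX_INTS (4 * 1024 * 1024)   /* hypothetical upper bound on elements */

static volatile int array[MAX_INTS];

/* Zero the first n elements, increment each one `iterations` times, and
   return their sum. If `seconds` is non-NULL it receives the CPU time
   spent in the increment loops only (zeroing and summing are excluded). */
long long run_increments(size_t n, int iterations, double *seconds)
{
    long long sum = 0;

    for (size_t x = 0; x < n; x++)
        array[x] = 0;

    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        for (size_t x = 0; x < n; x++)
            array[x]++;
    clock_t end = clock();

    for (size_t x = 0; x < n; x++)
        sum += array[x];

    if (seconds != NULL)
        *seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return sum;
}

/* Print the time per increment for working sets from 1024 elements up to
   MAX_INTS, doubling each step, with the same total work at every size. */
void print_sweep(void)
{
    const long long total = 64LL * 1024 * 1024;   /* hypothetical work budget */

    for (size_t n = 1024; n <= MAX_INTS; n *= 2) {
        int iterations = (int)(total / (long long)n);
        double seconds = 0.0;
        long long sum = run_increments(n, iterations, &seconds);

        printf("%8zu KiB  %.3f ns/increment  (checksum %lld)\n",
               n * sizeof(int) / 1024,
               1e9 * seconds / ((double)n * iterations), sum);
    }
}
```

Calling print_sweep() from main and comparing the ns/increment column across sizes shows whether the working-set size matters on a given machine. Because the access pattern is purely sequential, hardware prefetching may keep the numbers fairly flat even once the array no longer fits in L1, so a flat result does not by itself mean the compiler removed the loop.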