2

this is the C program under Linux/GUN:

#include<stdio.h>
#include<sys/time.h>
#define Max 1024*1024

int main()
{
    struct timeval start,end;
    long int dis;
    int i;
    int m=0;
    int a[Max];
    gettimeofday(&start,NULL);
    for(i=0;i<Max;i += 1){
            a[Max] *= 3;    
    }   
    gettimeofday(&end,NULL);
    dis = end.tv_usec - start.tv_usec;
    printf("time1: %ld\n",dis);

    gettimeofday(&start,NULL);
    for(i=0;i<Max;i += 16){
            a[Max] *= 3;
    }
    gettimeofday(&end,NULL);
    dis = end.tv_usec - start.tv_usec;
    printf("time2: %ld\n",dis);

    return 0;
}

the output:

time1: 7074

time2: 234

it's a big distance

this Java program:

public class Cache1 {
public static void main(String[] args){
    int a[] = new int[1024*1024*64];

    long  time1 = System.currentTimeMillis();
    for(int i=0;i<a.length;i++){
        a[i] *= 3;
    } 
    long time2 = System.currentTimeMillis();
    System.out.println(time2 - time1);

    time1 = System.currentTimeMillis();
    for(int i=0;i<a.length;i += 16){
        a[i] *= 3;
    }
    time2 = System.currentTimeMillis();
    System.out.println(time2 - time1);
}
}

the output:

92

82

it's nealy the same

with the CPU Cache. why they hava so much difference? the Cpu Cache is invalid in C programing?

3
  • To find the execution time,go for System.nanoTime() instead of System.currentTimeMillis() Commented Nov 9, 2013 at 8:12
  • 3
    In your C code the statement a[Max] *= 3 will modify memory past the end of the array. Did you mean a[i]? Commented Nov 9, 2013 at 8:26
  • Yes it was a mistake,a[i] is correct. Commented Nov 13, 2013 at 8:31

2 Answers 2

5

I hope you realize that the difference in units of time in those tests is 10^3. C code is order of magnitude faster than Java code.

In C code there should be a[i] instead of a[Max].

As for cache: since you access only one memory location in your C code (which triggers undefined behavior) your C test is completely invalid.

And even if it were correct, your method is flawed. It is quite possible that the multiplication operations and even the whole loops were skipped completely by C copiler, since nothing depends on their outcome.

The result where first run takes long, and the second takes less time, is expected. Data has to be loaded to cache anyway, and that takes time. Once it is loaded, operations on that data take less time.

Java may either not use cache at all (not likely) or preload the whole array to cache even before the loops are executed. That would explain equal execution times.

Sign up to request clarification or add additional context in comments.

Comments

3

You have three cache sizes, these are typically

  • L1: 32 KB (data), 4 clock cycles
  • L2: 256KB, 10-11 clock cycles
  • L3: 3-24 MB. 40 - 75 clock cycles.

Anything larger than this will not fit into the cache as if you just scroll through memory it will be like they are not there.

I suggest you write a test which empirically works out the CPU cache sizes as a good exercise to help you understand this. BTW You don't need to use *= to exercise the cache as this exercises the ALU. Perhaps there is a simpler operation you can use ;)

In the case of your Java code, most likely it is not compiled yet so you are seeing the speed of the interperator, not the memory accesses.

I suggest you run the test repeatedly on smaller memory sizes for at least 2 seconds and take the average.

3 Comments

+1 for pointing out that the Java results are most likely bogus.
@StephenC It is correct if in the real use case you are only going to call this method once for about this data set size. In that case, you most likely don't care about performance esp if you are using a selection sort.
@PeterLawrey Thanks, I will have a try about your answer

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.