2

I am writing a program that requires me to know the execution time of a function, given any number of loops. Upon fine tuning I discovered that the execution time is non-linear, and I would like to know why. I rewrote the code into something I could share on here, and ran simulations on it and got pretty much the same results. Here is the code:

void DummyLoop(long y)
{
    long counter = 0;
    long DummyArray[3];
    char loopArray;
    
    while(counter < y)
    {
        for(loopArray = 0; loopArray < 3; loopArray++)
        {
            DummyArray[x] = 10000/50; // Some int division operation
        }
        counter++;
    }
}

I measured the results using an internal timer of the MCU, and call the loop like this:

T0CON0bits.EN = 1; // Start timer
DummyLoop(10000);
T0CON0bits.EN = 0; // Stop timer

The results are then printed over UART once the timer is stopped. These are the times I measured executing the same function with a growing number of loops. All these results were measured individually for different loop count inputs.

enter image description here

I added a column in there multiplying the time it took for one loop by the amount of loops for that test, just to show how the results change. It does taper out after about 1000 loops, but why does the result change so much? I would have thought the time it takes to run one loop could be multiplied for any number? The MCU I am using is an 8-bit PIC18F47Q43 for reference, compiled with XC8.

5
  • 3
    If you build with optimizations enabled (which you should, otherwise benchmarking is rather useless), then the whole function will likely be empty (or maybe not called at all), since there's no visible side-effects outside of the function. Commented Jun 2, 2024 at 13:12
  • Everything looks pretty linear so it’s probably the overhead of calling the function and any other code that is outside the loop skews your timing for the “1” result and you not taking that into account when you multiply it up. Commented Jun 2, 2024 at 13:17
  • What are you going to use the meansurement for? Commented Jun 2, 2024 at 13:31
  • 1
    10000/50 is a constant expression. It will be evaluated at compile time and replaced with 200. Commented Jun 2, 2024 at 19:28
  • If your compiler is mediocre or better (and you enabled optimizations, obviously) then the execution time will be approximately 0 seconds, or rather your GPIO pin toggle time. Are there any good compilers for the target or do you have to use Microchip's joke of a compiler? Consider porting to a real one. Commented Jun 3, 2024 at 6:36

2 Answers 2

6

You cannot simply multiply the number of loops with some constant, since you don't take the overhead of the rest of the code into account.

Instead, the correct formula is: time(loops) = loops * K1 + K2.

We have the times for 1 and for 2 loops. This gives us the constant K1.

K1 = time(2) - time(1) = 0.017625 ms - 0.0100625 ms = 0.0075625 ms

With this result we can calculate the constant K2.

K2 = time(1) - 1 * K1 = 0.0100625 ms - 1 * 0.0075625 ms = 0.0025 ms

Let's look at results again.

Loops Measured Calculated Delta
1 0.0100625 0.0100625 0
2 0.017625 0.017625 0
3 0.0251875 0.0251875 0
10 0.078125 0.078125 0
50 0.380625 0.380625 0
100 0.75875 0.75875 0
500 3.78375 3.78375 0
1000 7.565 7.565 0
5000 37.814 37.815 0.001
10000 75.626 75.6275 0.0015

Looks good, doesn't it? The differences for the last two values are certainly rounding errors of the used output routine. Calculations with floats are commonly limited to about 6 non-zero digits.


Note 1:

K1 is the time for one loop:

  • Checking the outer loop condition (positively)
  • The inner for loop inside the outer loop
  • The dummy calculation and assignment

K2 is the time of the one-time overhead:

  • Start/stop the timer (details depend on the exact working of the timer)
  • Call and return of the function
  • Prologue and epilogue in the function
  • Quitting the outer loop
K1  K2
    x   T0CON0bits.EN = 1; // Start timer
    x   DummyLoop(10000);
    x   T0CON0bits.EN = 0; // Stop timer

    x   void DummyLoop(long y)
    x   {
    x       long counter = 0;
    x       long DummyArray[3];
    x       char loopArray;
    
x   x       while(counter < y)
x           {
x               for(loopArray = 0; loopArray < 3; loopArray++)
x               {
x                   DummyArray[x] = 10000/50; // Some int division operation
x               }
x               counter++;
x           }
    x   }

Note 2:

You can sum these times from the generated machine code.

Or if this is too complicated, use a port pin to toggle and an oscilloscope to measure the interval at this pin.

Sign up to request clarification or add additional context in comments.

1 Comment

This is exactly what I was looking for, thank you! What exactly do K1 and K2 represent?
0

This is 8bit MCU, so, if you are having your code without scheduler, then approximately without any interrrupts, you can get almost same execution time for every loop. However, if there is scheduler and any interrupt source interfering, then execution of single loop will take different time every time it is executed. if t1 is time taken to execute 10000 loops, you can get better result with average of value of t1/10000 for single loop. According to the result, there is one loop time multiplied by how many loops, if you took it through execution it may not be accurate or if you calculated by use of timings of instructions, then scheduler/interrupt sources may interfere with loop execution. Pls specify what is the one loop timing reference taken from.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.