As a premise, I am aware that this problem has been addressed already, but never in this specific scenario, from what I could find searching.
In a time-critical piece of code, I have a loop where a float value x must grow linearly from exactly 0 to-and-including exactly 1 in 'z' steps.
The un-optimized solution, but which would work without rounding errors, is:
const int z = (some number);
int c;
float x;
for(c=0; c<z; c++)
{
x = (float)c/(float)(z-1);
// do something with x here
}
obviously I can avoid the float conversions and use two loop variables and caching (float)(z-1):
const int z = (some number);
int c;
float xi,x;
const float fzm1 = (float)(z-1);
for(c=0,xi=0.f; c<z; c++, xi+=1.f)
{
x=xi/fzm1;
// do something with x
}
But who would ever repeat a division by a constant for every loop pass ? Obviously anyone would turn it into a multiplication:
const int z = (some number);
int c;
float xi,x;
const float invzm1 = 1.f/(float)(z-1);
for(c=0,xi=0.f; c<z; c++, xi+=1.f)
{
x=xi * invzm1;
// do something with x
}
Here is where obvious rounding issues may start to manifest. For some integer values of z, (z-1)*(1.f/(float)(z-1)) won't give exactly one but 0.999999..., so the value assumed by x in the last loop cycle won't be exactly one.
If using an adder instead, i.e
const int z = (some number);
int c;
float x;
const float x_adder = 1.f/(float)(z-1);
for(c=0,x=0.f; c<z; c++, x+=x_adder)
{
// do something with x
}
the situation is even worse, because the error in x_adder will build up.
So the only solution I can see is using a conditional somewhere, like:
const int z = (some number);
int c;
float xi,x;
const float invzm1 = 1.f/(float)(z-1);
for(c=0,xi=0.f; c<z; c++, xi+=1.f)
{
x = (c==z-1) ? 1.f : xi * invzm1;
// do something with x
}
but in a time-critical loop a branch should be avoided if possible !
Oh, and I can't even split the loop and do
for(c=0,xi=0.f; c<z-1; c++, xi+=1.f) // note: loop runs now up to and including z-2
{
x=xi * invzm1;
// do something with x
}
x=1.f;
// do something with x
because I would have to replicate the whole block of code 'do something with x' which is not short or simple either, I cannot make of it a function call (would be inefficient, too many local variables to pass) nor I want to use #defines (would be very poor and inelegant and impractical).
Can you figure out any efficient or smart solution to this problem ?