Numpy array float precision is not deterministic

Question

When I run the snippet below:

import numpy as np

X = np.array([-0.20000008], dtype=np.float32)
np.floor((X + 1) / 0.04)

array([20.], dtype=float32)

The output is clearly wrong: the result should be just below 20, so it should floor to 19.

I get that this is a precision error, but all of the samples below produce the correct result, although they should have similar precision:

X = np.array([-0.20000008], dtype=np.float32).item()
np.floor((X + 1) / 0.04) # 19.0

X = np.float32(-0.20000008)
np.floor((X + 1) / 0.04) # 19.0

X = np.array([-0.20000008], dtype=np.float32)
np.floor(X / 0.04 + 1 / 0.04) # array([19.], dtype=float32)
np.floor(np.multiply((X + 1), 1/0.04)) # array([19.], dtype=float32)

If I cast it to float64 it works too, but that cast is very expensive for my application. Are there any solutions that stick to float32?

What about using np.array([np.floor((x + 1) / 0.04) for x in X]) based on your observations?
– Vezen BU, commented Mar 3, 2022 at 6:26

flawr · Accepted Answer · 2022-03-03 10:29:57Z

3

Let's try to understand the first two of the three examples on the bottom first:

In the first example

np.array([-0.20000008], dtype=np.float32).item()

will produce a native Python float, which is 64-bit, so no surprises here.
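A quick way to check this yourself, as a minimal sketch (the variable names x and q are only illustrative):

```python
import math
import numpy as np

X = np.array([-0.20000008], dtype=np.float32)

# .item() converts the float32 element to a built-in Python float (64-bit).
x = X.item()
print(type(x))

# All arithmetic from here on is done in double precision.
q = (x + 1) / 0.04
print(q, math.floor(q))
```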

In the second example you have created a NumPy 32-bit scalar (shape==(), type==np.float32), which gets treated more or less like other scalars: as soon as you add an int (1), the result will be a 64-bit number.
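To see what a difference the 64-bit promotion makes, the precision can be pinned explicitly. This is only a sketch: the explicit np.float32(...) and np.float64(...) casts are my addition, so that the outcome does not depend on how a particular NumPy version promotes mixed scalar operations (the rules for Python scalars changed in NumPy 2), and the variable names are illustrative.

```python
import numpy as np

x32 = np.float32(-0.20000008)

# Everything kept in float32: all operands are float32 scalars.
q32 = (x32 + np.float32(1)) / np.float32(0.04)

# Everything in float64: widen the same stored value first.
x64 = np.float64(x32)
q64 = (x64 + np.float64(1)) / np.float64(0.04)

print(q32.dtype, q32, np.floor(q32))
print(q64.dtype, q64, np.floor(q64))
```

If the scalar expression from the question ends up on the float64 path, it floors the same way as q64 does here.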

The interesting case now is actually your initial piece of code and the third example: in both cases you have an array (shape==(1,), type==np.ndarray). In operations between an array and a scalar, the dtype of the array (float32 here) is preserved. So what remains is just the issue that the distributive law does not hold for floating-point numbers: (X + 1) / 0.04 and X / 0.04 + 1 / 0.04 are rounded at different steps and need not agree. Here you're doing computations that rely on the least significant bits of floating-point numbers.
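The two array expressions from the question can be put side by side as a small sketch (a and b are just illustrative names):

```python
import numpy as np

X = np.array([-0.20000008], dtype=np.float32)

a = (X + 1) / 0.04        # add first, then divide
b = X / 0.04 + 1 / 0.04   # divide each term, then add

# Both results keep the array's dtype, float32.
print(a.dtype, b.dtype)

# The rounding happens at different steps, so the two forms
# end up on different sides of 20 before flooring.
print(a, np.floor(a))
print(b, np.floor(b))
```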

7 Comments

Jérôme Richard Over a year ago

Interesting. But how to explain that ((X + np.float32(1)) / 0.04).dtype is float32 and ((X[0] + np.float32(1)) / 0.04).dtype is float64?

flawr Over a year ago

In the first example X is an np.ndarray, while in the second one X[0] is an np.float32 scalar, so it's again a situation similar to what I mentioned in the answer. Note that X[0] is not the same as X.item() here: the former still gets you a NumPy object, while the latter is a native Python float!

flawr Over a year ago

So to answer your question: in the first case we have an array, and the dtype of the array is preserved through operations with scalars. In the second case we have only scalars, among which is a 64-bit float (0.04), so the result will be a 64-bit number.

ma7555 Over a year ago

What about np.floor(np.multiply((X + 1), 1/0.04)), which produces 19.0 as well?

flawr Over a year ago

@ma7555 I'd suggest dividing it up into all the intermediate results, inspecting the types and then thinking about where exactly you find a difference to what you expect!
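One way to do that inspection, as a rough sketch (the names s, inv, p and d are illustrative and not from the question):

```python
import numpy as np

X = np.array([-0.20000008], dtype=np.float32)

s = X + 1                 # float32 array
inv = 1 / 0.04            # plain Python float, computed in double precision
p = np.multiply(s, inv)   # array * scalar keeps float32
d = s / 0.04              # array / scalar keeps float32

for name, v in [("s", s), ("p", p), ("d", d)]:
    print(name, v.dtype, repr(v))
print("inv", type(inv), inv)

# Compare what the two scalars look like once stored as float32:
# the divisor 0.04 has to be rounded, the factor inv does not.
print(float(np.float32(0.04)), float(np.float32(inv)))
```

Comparing p and d shows that multiplying by 1/0.04 and dividing by 0.04 are not the same float32 operation, because the two scalars carry different rounding errors.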
