I was trying to mimic the result of a simple TensorFlow/Keras Dense layer with NumPy (forward pass only) and I was surprised not to get exactly the same result.
A dense layer's output is just the product of the input vector and the weight matrix (ignoring the bias here), so I simply extracted the weights from my Dense layer and used them with NumPy. However, I get slightly different results than when processing the input directly with TensorFlow/Keras.
Here is a minimal reproducible example:
import numpy as np
import keras
from keras import layers
print("Keras version:", keras.__version__)
print("Backend", keras.backend.backend())
# Keras model
ins = layers.Input((2,), name='input')
out = layers.Dense(5, kernel_initializer='random_normal', use_bias=False, name='output')(ins)
shallow_model = keras.Model(inputs=ins, outputs=out)
# Input
x = np.random.random(size=(5, 2)).astype(np.float32)
# Keras output
out_keras = shallow_model.predict(x)
# Get weights of Dense Layer
[kernel] = shallow_model.layers[1].get_weights()
# Try the same product in NumPy
out_numpy = np.matmul(x, kernel)
# Compare results
print("Keras result:\n", out_keras)
print("Numpy result:\n", out_numpy)
print("Same result:", np.allclose(out_keras, out_numpy))
An example of output:
Keras version: 3.3.3
Backend tensorflow
Keras result:
[[-0.13240188 0.00676447 -0.11455889 0.00669269 0.00392148]
[-0.04194738 -0.01847801 -0.06489066 -0.03474987 -0.0088181 ]
[-0.12778029 -0.00793487 -0.13061695 -0.01940094 -0.00327162]
[-0.07080866 0.0196894 -0.03897876 0.0323153 0.00993819]
[-0.03812894 -0.01733573 -0.05973222 -0.0325517 -0.00827873]]
Numpy result:
[[-0.13241094 0.00675571 -0.11458878 0.00667856 0.00392317]
[-0.04195426 -0.01849106 -0.06492892 -0.03477099 -0.00881922]
[-0.12777513 -0.00794449 -0.13064197 -0.01941477 -0.00326829]
[-0.07078259 0.01968134 -0.03896207 0.03230151 0.00993471]
[-0.0381208 -0.01734212 -0.05974622 -0.03256048 -0.00827707]]
Same result: False
Now, I get that the results are close, but I was wondering where the difference comes from. Any ideas?
Edit: I believe it may have something to do with floating-point precision and operation order, as explained here: Numerical errors in Keras vs Numpy. I'd like a slightly more detailed answer if possible, in particular given that the only operation here is a matrix multiplication.
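For what it's worth, here is a rough check I would add to the script above to see whether the gap is only float32 rounding (just a sketch, reusing x, kernel, out_keras and out_numpy from the example; the float64 recomputation and the eps comparison are my own idea, not something from the linked question):
# Redo the product in float64 as a higher-precision reference
out_numpy_64 = np.matmul(x.astype(np.float64), kernel.astype(np.float64))
# If the difference is pure float32 rounding, both results should sit within
# a few float32 machine epsilons (~1.2e-7, relative to the value magnitudes)
# of the float64 reference
print("max |keras - numpy|   :", np.max(np.abs(out_keras - out_numpy)))
print("max |keras - numpy64| :", np.max(np.abs(out_keras - out_numpy_64)))
print("max |numpy - numpy64| :", np.max(np.abs(out_numpy - out_numpy_64)))
print("float32 eps:", np.finfo(np.float32).eps)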