I came across a weird behavior with numpy, for which I do not have a good explanation.
In one function, I do some operations on a copied numpy array. There is no issue when I run this code once. However, when I call the function repeatedly with the same input, for example in a for loop, I get the following warning:
RuntimeWarning: invalid value encountered in cast
  histogram = ((histogram / histogram.sum()) * INT_CASTER).astype(np.int64)
Here histogram is an m x m ndarray with dtype np.float64, and histograms is an n x m x m tensor containing all the individual histograms. Each histogram sums to a value > 0, so a division by 0 would be quite surprising.
These histograms are generated from external data on each function call. I am not sure how to avoid this behavior. I also tried del histograms as a way to avoid the error. This only helped partially: I can now call the function a few times in a row without getting the error, but with more than 5 calls it appears again. Without the del, the error already appears after two iterations.
The function is as follows:
def _normalize_cast_int(histograms: NDArray[np.float64]) -> NDArray[np.int64]:
    int_histograms = np.empty(histograms.shape, dtype=np.int64)
    instances = histograms.shape[0]
    for i in range(instances):
        histogram = histograms[i].copy()
        histogram = ((histogram / histogram.sum()) * INT_CASTER).astype(np.int64)
        error = INT_CASTER - histogram.sum()
        mask = np.zeros(shape=histogram.size, dtype=np.int64)
        mask[:error] += 1
        np.random.seed(42)
        np.random.shuffle(mask)
        int_histograms[i] = histogram + np.reshape(mask, (-1, histogram.shape[0]))
    return int_histograms
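histogram = histograms[i].copy(), so histogram will be the i-th entry of histograms (one m x m histogram). Then histogram / histogram.sum() will result in 0/0, which is NaN, when that entry of histograms sums to 0. Casting NaN to np.int64 is what triggers the "invalid value encountered in cast" warning.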
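One way to check whether this is what happens is to look at the per-histogram sums before normalizing. The snippet below is only a minimal sketch: the helper name find_bad_histograms and the toy input are made up for illustration and are not part of the original code. It flags every histogram whose sum is zero or not finite (NaN or inf), since either case would produce invalid values in the division and the subsequent cast.

```python
import numpy as np


def find_bad_histograms(histograms: np.ndarray) -> np.ndarray:
    """Return the indices of histograms whose sum is zero or not finite.

    Hypothetical helper for debugging; expects an n x m x m array.
    """
    # One total per histogram: sum over the last two axes.
    totals = histograms.sum(axis=(1, 2))
    # A zero total gives 0/0, a NaN/inf total poisons the division as well.
    bad = ~np.isfinite(totals) | (totals == 0)
    return np.flatnonzero(bad)


# Toy input (assumption, not the real data): the second histogram is all zeros.
histograms = np.ones((3, 2, 2), dtype=np.float64)
histograms[1] = 0.0

bad_indices = find_bad_histograms(histograms)
print(bad_indices)
```

If this reports any indices on the calls that emit the warning, the input data for those calls is worth inspecting, because the problem then lies in how histograms is generated rather than in the normalization itself.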