I came across some odd NumPy behavior that I cannot explain. In one function, I perform some operations on a copy of a NumPy array. Running the code once causes no problem. However, when I call the function repeatedly with the same input, for example in a for loop, I get the following warning:

RuntimeWarning: invalid value encountered in cast
  histogram = ((histogram / histogram.sum()) * INT_CASTER).astype(np.int64)

Each histogram is an m x m ndarray with dtype np.float64. histograms is an n x m x m tensor containing all individual histograms. Each histogram sums to > 0, so a division by zero would be surprising. The histograms are regenerated from external data on every function call. I am not sure how to avoid this behavior. I also tried del histograms, which helped only partially: I can now call the function a few times in a row without the warning, but after more than 5 calls it appears again. Without the del, the warning already appears after two iterations.

The function is as follows:

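import numpy as np
from numpy.typing import NDArray

# INT_CASTER is a constant defined elsewhere; its value is not shown here.

def _normalize_cast_int(histograms: NDArray[np.float64]) -> NDArray[np.int64]:
    int_histograms = np.empty(histograms.shape, dtype=np.int64)

    instances = histograms.shape[0]
    for i in range(instances):
        histogram = histograms[i].copy()
        # Normalize to sum 1, scale by INT_CASTER, then truncate to integers.
        # If histogram.sum() is 0, this is 0/0 and yields NaN before the cast.
        histogram = ((histogram / histogram.sum()) * INT_CASTER).astype(np.int64)

        # Truncation loses a few counts; spread the remainder over random cells.
        error = INT_CASTER - histogram.sum()
        mask = np.zeros(shape=histogram.size, dtype=np.int64)
        mask[:error] += 1
        # Reseeding on every iteration gives each histogram the same shuffle.
        np.random.seed(42)
        np.random.shuffle(mask)
        int_histograms[i] = histogram + np.reshape(mask, (-1, histogram.shape[0]))

    return int_histograms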

  • If the sum were 0, you'd end up with NaNs, which cannot be cast to integers. Commented Nov 16, 2023 at 21:01
  • The sum should never change, since I generate the histograms n times from the same data. The data itself is not changed, so a sum of 0 should not be possible. Commented Nov 16, 2023 at 21:04
  • It will be easier for someone to help you if you provide a minimal reproducible example. Commented Nov 16, 2023 at 21:04
  • And you're sure that no ROW of the histogram sums to 0? Commented Nov 16, 2023 at 21:30
  • "An individual row can sum to 0." You have histogram = histograms[i].copy(), so histogram will be a row of histograms. Then histogram / histogram.sum() will result in 0/0 when a row of histograms sums to 0. Commented Nov 16, 2023 at 22:27

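The comments above suggest that some histograms[i] may sum to 0. A minimal sketch of how that could be checked before normalizing is shown here; the helper name find_zero_sum_slices and the demo array are hypothetical and not part of the original code.

```python
import numpy as np


def find_zero_sum_slices(histograms: np.ndarray) -> np.ndarray:
    """Return the indices i for which histograms[i] sums to 0 or is not finite.

    Hypothetical helper for diagnosis only; it does not modify the input.
    """
    # Sum over every axis except the first, giving one total per histogram.
    axes = tuple(range(1, histograms.ndim))
    totals = histograms.sum(axis=axes)
    bad = ~np.isfinite(totals) | (totals == 0)
    return np.flatnonzero(bad)


# Made-up demo data: three 2 x 2 histograms, the middle one all zeros.
demo = np.ones((3, 2, 2), dtype=np.float64)
demo[1] = 0.0
bad_indices = find_zero_sum_slices(demo)
print(bad_indices)
```

Calling such a check on the real histograms right before the normalization would show whether the warning coincides with a zero-sum slice.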
1 Answer

I also suspected that the warning appears because the sum is taken over an empty or all-zero array. Note that sum does not fail on an empty array: it returns 0, so the following division becomes 0/0 and produces NaN, which cannot be cast to an integer.
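As a small illustration of that mechanism, the sketch below uses made-up arrays (not the asker's data) to show what sum returns for an empty array and what dividing an all-zero array by its own sum produces.

```python
import numpy as np

# Summing an empty float array does not raise; the result is 0.0.
empty = np.empty((0, 0), dtype=np.float64)
total = empty.sum()

# Dividing an all-zero array by its own sum is 0/0 in every cell.
zeros = np.zeros((2, 2), dtype=np.float64)
with np.errstate(invalid="ignore"):  # silence the division warning for the demo
    normalized = zeros / zeros.sum()

all_nan = bool(np.isnan(normalized).all())
print(total, all_nan)
```

The NaN values created by the division are what the later astype(np.int64) cannot represent.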

The histogram may also contain arbitrary non-zero values, which would explain why the problem seems random: one call works and the next does not. In addition, a value that falls outside the range of int64 can trigger the same warning during the cast.
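One way to test for both conditions before casting is sketched below. The function name is_safe_for_int64 is hypothetical, and the check is only an illustration of the idea, not something taken from the question.

```python
import numpy as np


def is_safe_for_int64(values: np.ndarray) -> bool:
    """Return True if every value is finite and fits into np.int64.

    Hypothetical helper; intended to be called right before astype(np.int64).
    """
    info = np.iinfo(np.int64)
    lower = float(info.min)  # -2**63, exactly representable as a float
    upper = float(info.max)  # rounds up to 2**63, which is out of range
    with np.errstate(invalid="ignore"):
        # Strict "<" on the upper bound because 2**63 itself does not fit.
        in_range = (values >= lower) & (values < upper)
    return bool(np.all(np.isfinite(values) & in_range))


print(is_safe_for_int64(np.array([1.0, 2.5])))
print(is_safe_for_int64(np.array([np.nan])))
print(is_safe_for_int64(np.array([1e30])))
```

If the check fails for a given histogram, printing that histogram and its sum should show which of the two causes applies.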

When I ran into this warning myself, I converted the values with the built-in int() function (float() in my case), and all RuntimeWarnings disappeared. Python's int has arbitrary precision, so a large finite value cannot overflow it the way it can overflow np.int64. Note that int() still raises a ValueError when it is given NaN.

You can also convert the values inside the for loop, at the cost of an extra function call per value. Another option is to validate each value in a try statement (at the cost of several extra statements) and, when a NaN is encountered, skip that iteration with a continue statement.
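The skip-with-continue idea could look roughly like the sketch below. It uses an explicit check on the sum instead of a try statement, takes the scaling constant as a hypothetical parameter scale (the question does not show the value of INT_CASTER), and leaves out the step that redistributes the rounding remainder.

```python
import numpy as np


def normalize_cast_int_guarded(histograms: np.ndarray, scale: int) -> np.ndarray:
    """Scale each histogram to integer counts, skipping unusable ones.

    Hypothetical variant: histograms whose sum is zero, negative, or not
    finite are left as all zeros instead of being divided.
    """
    out = np.zeros(histograms.shape, dtype=np.int64)
    for i in range(histograms.shape[0]):
        total = histograms[i].sum()
        if not np.isfinite(total) or total <= 0:
            continue  # would be 0/0 or NaN; skip this histogram
        out[i] = np.floor(histograms[i] / total * scale).astype(np.int64)
    return out


# Made-up demo data: one uniform histogram and one all-zero histogram.
demo = np.array([[[1.0, 1.0], [1.0, 1.0]],
                 [[0.0, 0.0], [0.0, 0.0]]])
result = normalize_cast_int_guarded(demo, scale=100)
print(result)
```

Whether a skipped histogram should stay all zeros, raise an error, or be logged depends on what the downstream code expects, so that choice is left open here.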
