How to optimize a numpy loop that sums values from an array which is indexed by another array where values equal the loop index

Question

I have this piece of code that is called many times while the application runs. It takes an array of numbers that represent values (value_array). These should be summed per zone, where the zones are defined in zone_array. zone_ids is a list of all possible zones in zone_array.

It is basically something along the lines of: I have a population raster map and I want to know how many people live in each zone of the zone map.

The code:

values = np.zeros(len(zone_ids))
for i in zone_ids:
    values[i] = round(np.nansum(value_array[zone_array == i]), 2)
return values

The culprit seems to be the for loop, but I have not found a way to eliminate it and still get the same results.

I tried it with bincount but did not succeed. Using numba's jit also has no effect.

I would like to stay away from Cython, as this code will be used in a QGIS plugin, which has no Cython support.

Test code:

import numpy as np


def fill_values(zone_array, value_array, zone_ids):
    values = np.zeros(len(zone_ids))
    for i in zone_ids:
        values[i] = round(np.nansum(value_array[zone_array == i]), 2)
    return values


def run():
    # 300 different zones
    zone_ids = range(300)
    # zone map with 300 zones
    zone_array = (np.random.rand(2000, 2000) * 300).astype(int)
    # value map from which we want the sum of values per zone (real map can have NaN values)
    value_array = (np.random.rand(2000, 2000) * 10.)
    value_array[5, 5] = np.nan  # the np.NAN alias was removed in NumPy 2.0
    fill_values(zone_array, value_array, zone_ids)


if __name__ == '__main__':
    run()

1.92 s ± 17.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With the bincount implementation suggested by Divakar:

203 ms ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The culprit is not the for loop itself. The problem is the comparison zone_array == i inside it: all 2000 x 2000 = 4e6 values have to be checked for equality with i for each zone id i.
– Chickenmarkus, Commented Oct 18, 2017 at 13:27
If I reduce the number of zone ids I get a speed increase, so the for loop is still involved in the performance issue. And since I know of no alternative to zone_array == i, I focus on the loop. Ideally I could somehow use zone_array == zone_ids and skip the loop.
– lorenz h, Commented Oct 18, 2017 at 13:42
You can broadcast the comparison with zone_array[:,:,None] == zone_ids, but that still leaves indexing in the for loop and doesn't give much of an improvement in performance.
– user2699, Commented Oct 18, 2017 at 17:48
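
To illustrate the broadcast comparison from the last comment, here is a minimal sketch on toy data (the small arrays are made up for illustration). It goes one step further than the comment and also sums through the mask, which removes the loop but not the cost: every cell is still compared against every zone id, and the boolean mask has one layer per zone.

```python
import numpy as np

# Toy data, invented for illustration only.
zone_ids = np.arange(4)
zone_array = np.array([[0, 1], [3, 1]])
value_array = np.array([[1.0, 2.0], [np.nan, 4.0]])

# Shape (2, 2, 4): one boolean layer per zone id.
mask = zone_array[:, :, None] == zone_ids

# Zero out NaN cells, keep only matching cells per layer,
# then sum over the two spatial axes.
val_nonan = np.where(np.isnan(value_array), 0, value_array)
sums = (val_nonan[:, :, None] * mask).sum(axis=(0, 1))
print(sums)  # [1. 6. 0. 0.]

# For the sizes in the question, the mask alone needs one byte per
# cell per zone: 2000 * 2000 * 300 bytes.
print(2000 * 2000 * 300 / 1e9, "GB")  # 1.2 GB
```

The work is still proportional to cells times zones, which is consistent with the comment's remark that broadcasting does not improve performance much.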

Divakar · Accepted Answer · 2017-10-18 18:00:10Z

With a direct use of bincount, any NaN in the weights would propagate into the sum of its zone. So you can simply replace the NaNs with zeros first and then use bincount. Being a vectorized solution, this should be much faster.

Hence, the implementation would be:

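# Replace NaN with 0 so it does not turn a whole zone's sum into NaN
val_nonan = np.where(np.isnan(value_array), 0, value_array)
# Sum the values per zone id in one pass, then round to 2 decimals
out = np.round(np.bincount(zone_array.ravel(), val_nonan.ravel()), 2)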
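
As a rough check, the sketch below compares the bincount approach with the loop from the question on smaller random data. The function names, the array sizes, and the use of `minlength` are assumptions added here, not part of the answer. `minlength` only pads the result so that zones which never occur in zone_array still get an entry.

```python
import numpy as np


def fill_values_loop(zone_array, value_array, zone_ids):
    # Reference implementation from the question: one boolean mask per zone.
    values = np.zeros(len(zone_ids))
    for i in zone_ids:
        values[i] = round(np.nansum(value_array[zone_array == i]), 2)
    return values


def fill_values_bincount(zone_array, value_array, n_zones):
    # Treat NaN as 0 so it does not turn a zone's sum into NaN.
    val_nonan = np.where(np.isnan(value_array), 0, value_array)
    # minlength is an addition here: it pads zones that never occur.
    sums = np.bincount(zone_array.ravel(), weights=val_nonan.ravel(),
                       minlength=n_zones)
    return np.round(sums, 2)


# Smaller than the 2000 x 2000 map in the question, to keep the check quick.
rng = np.random.default_rng(0)
zone_array = rng.integers(0, 20, size=(200, 200))
value_array = rng.random((200, 200)) * 10.0
value_array[5, 5] = np.nan

loop_result = fill_values_loop(zone_array, value_array, range(20))
bincount_result = fill_values_bincount(zone_array, value_array, 20)

# The two methods add the values in a different order, so allow a
# difference of one rounding step (0.01) rather than exact equality.
print(np.allclose(loop_result, bincount_result, atol=0.011))
```

Note that bincount requires non-negative integer zone ids, which holds for the zone maps described in the question.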

1 Comment

lorenz h Over a year ago

This works for my problem. Thanks a lot. I guess my bincount attempts were messed up by the NaN values. Additionally, use values = out[zone_ids] if you only want the results for a subset of zones.
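
A small sketch of the subset selection mentioned in this comment, using made-up toy arrays: bincount returns one sum per zone id from 0 up to the largest id present, and indexing with zone_ids picks out the zones of interest.

```python
import numpy as np

# Toy data, invented for illustration; zone 3 never occurs.
zone_array = np.array([[0, 1, 1], [2, 2, 4]])
value_array = np.array([[1.0, 2.0, np.nan], [3.0, 4.0, 5.0]])

val_nonan = np.where(np.isnan(value_array), 0, value_array)
out = np.round(np.bincount(zone_array.ravel(), val_nonan.ravel()), 2)
print(out)  # [1. 2. 7. 0. 5.]

# Only the sums for zones 1 and 4 are wanted.
zone_ids = [1, 4]
values = out[zone_ids]
print(values)  # [2. 5.]
```

If zone_ids can contain an id larger than any id present in zone_array, passing `minlength` to bincount avoids an out-of-range index.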
