Fast filling of array with numpy.random.uniform

Question

I need to make the function defined below as fast as possible. It is called millions of times, and its for loop is currently the bottleneck of my code.

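import numpy as np

def func(d):
    # d is a list of [low, high, count] triples
    d_arr = []
    for rN in d:
        # draw rN[2] samples uniformly from [rN[0], rN[1])
        d_arr.extend(np.random.uniform(rN[0], rN[1], rN[2]))

    return np.asarray(d_arr)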

d = [[0.01, 0.11, 3413], [0.11, 0.21000000000000002, 1305], [0.21000000000000002, 0.31000000000000005, 675], [0.31000000000000005, 0.41000000000000003, 439], [0.41000000000000003, 0.51000000000000012, 318], [0.51000000000000012, 0.6100000000000001, 221], [0.6100000000000001, 0.71000000000000008, 151], [0.71000000000000008, 0.81000000000000016, 109], [0.81000000000000016, 0.91000000000000014, 82], [0.91000000000000014, 1.0100000000000002, 64], [1.0100000000000002, 1.1100000000000003, 51], [1.1100000000000003, 1.2100000000000004, 41], [1.2100000000000004, 1.3100000000000003, 34], [1.3100000000000003, 1.4100000000000004, 28], [1.4100000000000004, 1.5100000000000005, 24], [1.5100000000000005, 1.6100000000000003, 21], [1.6100000000000003, 1.7100000000000004, 18], [1.7100000000000004, 1.8100000000000005, 16], [1.8100000000000005, 1.9100000000000004, 14], [1.9100000000000004, 2.0100000000000002, 12], [2.0100000000000002, 2.1100000000000003, 11], [2.1100000000000003, 2.2100000000000004, 10], [2.2100000000000004, 2.3100000000000005, 9], [2.3100000000000005, 2.4100000000000001, 8], [2.4100000000000001, 2.5100000000000002, 7], [2.5100000000000002, 2.6100000000000003, 7], [2.6100000000000003, 2.7100000000000004, 6], [2.7100000000000004, 2.8100000000000005, 6], [2.8100000000000005, 2.9100000000000006, 5], [2.9100000000000006, 3.0100000000000002, 5], [3.0100000000000002, 3.1100000000000003, 4], [3.1100000000000003, 3.2100000000000004, 4], [3.2100000000000004, 3.3100000000000005, 4], [3.3100000000000005, 3.4100000000000006, 4], [3.4100000000000006, 3.5100000000000007, 3], [3.5100000000000007, 3.6100000000000008, 3], [3.6100000000000008, 3.7100000000000004, 3], [3.7100000000000004, 3.8100000000000005, 3], [3.8100000000000005, 3.9100000000000006, 3], [3.9100000000000006, 4.0100000000000007, 2], [4.0100000000000007, 4.1100000000000012, 2], [4.1100000000000012, 4.2100000000000009, 2], [4.2100000000000009, 4.3100000000000014, 2], [4.3100000000000014, 4.410000000000001, 2], 
[4.410000000000001, 4.5100000000000016, 2], [4.5100000000000016, 4.6100000000000012, 2], [4.6100000000000012, 4.7100000000000009, 2], [4.7100000000000009, 4.8100000000000014, 2], [4.8100000000000014, 4.910000000000001, 2], [4.910000000000001, 5.0100000000000016, 1], [5.0100000000000016, 5.1100000000000012, 1], [5.1100000000000012, 5.2100000000000017, 1], [5.2100000000000017, 5.3100000000000014, 1], [5.3100000000000014, 5.410000000000001, 1], [5.410000000000001, 5.5100000000000016, 1], [5.5100000000000016, 5.6100000000000012, 1], [5.6100000000000012, 5.7100000000000017, 1], [5.7100000000000017, 5.8100000000000014, 1], [5.8100000000000014, 6.0100000000000016, 2], [6.0100000000000016, 6.2100000000000017, 2], [6.2100000000000017, 6.4100000000000019, 2], [6.4100000000000019, 6.6100000000000012, 2], [6.6100000000000012, 6.8100000000000014, 1], [6.8100000000000014, 7.0100000000000016, 1], [7.0100000000000016, 7.2100000000000017, 1], [7.2100000000000017, 7.4100000000000019, 1], [7.4100000000000019, 7.6100000000000021, 1], [7.6100000000000021, 7.8100000000000014, 1], [7.8100000000000014, 8.1100000000000012, 1], [8.1100000000000012, 8.4100000000000019, 1], [8.4100000000000019, 8.7100000000000009, 1], [8.7100000000000009, 9.0100000000000016, 1], [9.0100000000000016, 9.3100000000000005, 1], [9.3100000000000005, 9.7100000000000009, 1], [9.7100000000000009, 10.110000000000001, 1], [10.110000000000001, 10.510000000000002, 1], [10.510000000000002, 11.010000000000002, 1], [11.010000000000002, 11.510000000000002, 1], [11.510000000000002, 12.110000000000001, 1], [12.110000000000001, 12.710000000000003, 1], [12.710000000000003, 13.410000000000002, 1], [13.410000000000002, 14.210000000000003, 1], [14.210000000000003, 15.110000000000003, 1], [15.110000000000003, 16.110000000000003, 1], [16.110000000000003, 17.310000000000002, 1], [17.310000000000002, 18.710000000000004, 1], [18.710000000000004, 20.410000000000004, 1], [20.410000000000004, 22.410000000000004, 1], [22.410000000000004, 
24.910000000000004, 1], [24.910000000000004, 28.210000000000004, 1], [28.210000000000004, 32.710000000000008, 1], [32.710000000000008, 39.210000000000008, 1], [39.210000000000008, 49.710000000000008, 1], [49.710000000000008, 70.210000000000008, 1], [70.210000000000008, 133.41000000000005, 1]]

for _ in range(10000):
    d_arr = func(d)

I've tried with:

d = np.array(d).T
d_arr = np.random.uniform(d[0], d[1], d[2])

but it fails with:

ValueError: sequence too large; cannot be greater than 32
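
A likely reading of this error, offered as an illustration and not as part of the original question: the third argument of np.random.uniform is size, the shape of the output, not a per-interval count. Passing one count per interval therefore asks for an array with one dimension per interval, which exceeds the 32-dimension limit named in the message. The small sketch below uses made-up bounds to show how size is interpreted and how array-valued low and high broadcast against it.

```python
import numpy as np

# `size` is the output shape: this draws a 2 x 3 array, not "2 then 3" samples.
out = np.random.uniform(0.0, 1.0, size=(2, 3))

# Array-valued bounds broadcast against `size`: here each column
# gets its own interval, but every column has the same number of rows.
lows = np.array([0.0, 10.0])    # illustrative values, not from the question
highs = np.array([1.0, 11.0])
per_column = np.random.uniform(lows, highs, size=(4, 2))
```

Because broadcasting gives every interval the same number of samples, this form cannot express a different count per interval, which is what the question needs.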

I need an improvement at the code level, not parallel processing.
– Gabriel, Commented Aug 30, 2017 at 21:03
The fact that this function is called millions of times is already a warning sign; you may want to change that.
– user2357112, Commented Aug 30, 2017 at 23:57
Hard to see the exact task here, but you could pre-allocate d_arr, since its size is known a priori and extending is costly. Frequently changing the size of an array-based data structure is always slow.
– sascha, Commented Aug 30, 2017 at 23:58
@user2357112 I cannot. It is a tiny part of a much larger code, but an important one. It is related to a theoretical model that generates synthetic solutions, which are later compared with an observation via a likelihood function that is minimized with a genetic algorithm.
– Gabriel, Commented Aug 31, 2017 at 0:00

user2357112 · Accepted Answer · 2017-09-03 22:44:23Z

Instead of using np.random.uniform in a loop, generate enough random numbers in one np.random.random call and then scale them appropriately:

# Could be lows, highs, counts = np.array(d).T, except for the mixed dtypes.
# Taking input as 3 arrays of lows, highs, counts would let you skip this step.
lows, highs, counts = zip(*d) if len(d) else ((), (), ())

base = np.repeat(lows, counts)
scale = np.repeat(highs, counts) - base

random_nums = np.random.random(np.sum(counts)) * scale + base

(Aside from calling np.random.uniform in a loop, one of the things slowing down the original is extending a list with a NumPy array. That generates wrapper objects for every element of the array, a slow and needless process.)
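
For readers who want to drop this into their own code, here is a minimal sketch of the same idea wrapped in a function. The name fill_uniform and the small three-interval input are hypothetical and chosen only for illustration; the body follows the snippet above, with the inputs converted to arrays first.

```python
import numpy as np

def fill_uniform(lows, highs, counts):
    """Draw counts[i] samples from [lows[i], highs[i]) for each i, concatenated.

    Hypothetical helper name; the logic mirrors the answer's snippet.
    """
    lows = np.asarray(lows, dtype=float)
    highs = np.asarray(highs, dtype=float)
    counts = np.asarray(counts, dtype=int)

    # Repeat each lower bound and each width once per requested sample.
    base = np.repeat(lows, counts)
    scale = np.repeat(highs, counts) - base

    # One call produces all samples in [0, 1); scale and shift them into place.
    return np.random.random(counts.sum()) * scale + base

# Small illustrative input in the question's [low, high, count] layout.
d = [[0.01, 0.11, 3], [0.11, 0.21, 2], [5.0, 7.5, 4]]
lows, highs, counts = zip(*d)
samples = fill_uniform(lows, highs, counts)
```

The result has one entry per requested sample, ordered interval by interval, which matches the layout produced by the original loop.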

edited Sep 3, 2017 at 22:44

answered Aug 31, 2017 at 0:01


2 Comments

Gabriel Over a year ago

I think there's an error with the scaling. Shouldn't it be (scale - base)?

user2357112 Over a year ago

@Gabriel: Whoops, fixed. (You could save some time by subtracting before repeating.)
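
One way to read the "subtract before repeating" remark, sketched under the assumption that the bounds and counts are already available as NumPy arrays: compute the interval widths once on the short arrays, then repeat the widths. The function name fill_uniform_widths and the explicit rng argument are illustrative choices, and np.random.default_rng assumes NumPy 1.17 or later; it is used here only so that the two variants can be compared with the same seed.

```python
import numpy as np

def fill_uniform_widths(lows, widths, counts, rng):
    """Variant that takes precomputed widths (highs - lows).

    Hypothetical helper; rng is a numpy.random.Generator.
    """
    n = int(np.sum(counts))
    return rng.random(n) * np.repeat(widths, counts) + np.repeat(lows, counts)

# Illustrative values, not taken from the question's data.
lows = np.array([0.01, 0.11, 5.0])
highs = np.array([0.11, 0.21, 7.5])
counts = np.array([3, 2, 4])

# Subtract on the short arrays, once, outside the hot loop.
widths = highs - lows
a = fill_uniform_widths(lows, widths, counts, np.random.default_rng(0))

# Reference: repeat first, then subtract on the long arrays.
rng = np.random.default_rng(0)
base = np.repeat(lows, counts)
b = rng.random(int(counts.sum())) * (np.repeat(highs, counts) - base) + base
```

Both variants perform the same subtraction on the same pairs of bounds, so with the same seed they should produce identical samples; the first simply does fewer subtractions. If the intervals never change between calls, the repeated arrays themselves could also be computed once and reused.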
