
I need to generate dropout masks for a specific neural network, and I am looking for the fastest possible way to do this with NumPy (CPU only).

I have tried:

import numpy as np

def gen_mask_1(size, p=0.75):
    return np.random.binomial(1, p, size)


def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

where p is the probability of drawing a 1.

The speed of these two approaches is comparable.

%timeit gen_mask_1(size=2048)
%timeit gen_mask_2(size=2048)

45.9 µs ± 575 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
47.4 µs ± 372 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Are there faster methods?
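As a related option (untimed in the benchmarks above): on NumPy 1.17+, the Generator API from np.random.default_rng is generally faster than the legacy np.random functions. A minimal sketch of the same thresholding idea, where gen_mask_rng is a hypothetical name:

```python
import numpy as np

# Sketch only: the threshold trick with NumPy's newer Generator API
# (NumPy >= 1.17); gen_mask_rng is a hypothetical name.
rng = np.random.default_rng()

def gen_mask_rng(size, p=0.75):
    # Generator.random is generally faster than the legacy np.random.rand
    return (rng.random(size) < p).astype(int)

mask = gen_mask_rng(2048)
```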

UPDATE

Following the suggestions so far, I have tested a few extra implementations. I couldn't get @njit to work with parallel=True (TypingError: Failed in nopython mode pipeline (step: convert to parfors)); it works without it, but, I think, less efficiently. I also found a Python binding for Intel's mkl_random (thank you @MatthieuBrucher for the tip!) here: https://github.com/IntelPython/mkl_random So far, using mkl_random together with @nxpnsv's approach gives the best result.

@njit
def gen_mask_3(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

def gen_mask_4(size, p=0.75):
    return (np.random.rand(size) < p).astype(int)

def gen_mask_5(size):
    return np.random.choice([0, 1, 1, 1], size=size)

def gen_mask_6(size, p=0.75):
    return (mkl_random.rand(size) < p).astype(int)

def gen_mask_7(size):
    return mkl_random.choice([0, 1, 1, 1], size=size)

%timeit gen_mask_4(size=2048)
%timeit gen_mask_5(size=2048)
%timeit gen_mask_6(size=2048)
%timeit gen_mask_7(size=2048)

22.2 µs ± 145 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
25.8 µs ± 336 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7.64 µs ± 133 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
29.6 µs ± 1.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
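For context (not part of the benchmarks above), a mask like these is typically applied with "inverted dropout": the kept activations are divided by the keep probability p at training time so that no rescaling is needed at inference. A minimal sketch, with apply_dropout as a hypothetical helper:

```python
import numpy as np

def apply_dropout(x, p=0.75):
    # p is the keep probability, matching the convention in the question
    mask = (np.random.rand(*x.shape) < p).astype(x.dtype)
    # inverted dropout: dividing by p keeps the expected activation unchanged
    return x * mask / p

out = apply_dropout(np.ones((64, 64)))
```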
  • Probably not using numpy random number generators. Perhaps by using MKL directly (software.intel.com/en-us/forums/intel-math-kernel-library/topic/…) Commented Dec 13, 2018 at 12:46
  • @MatthieuBrucher I am open to alternatives, the mask needs to be a np.array at the end, so I will need to do the conversion anyway. But if you know a faster way of generating random digits in python please share it. Commented Dec 13, 2018 at 12:48
  • It might be faster, but nothing for certain. MKL has routines for this (software.intel.com/en-us/…), but there is no example to create the state and call the function from Python. Commented Dec 13, 2018 at 12:50
  • This version of gen_mask_2 runs faster than either of your attempts: (np.random.rand(size) < p).astype(int) Commented Dec 13, 2018 at 13:15
  • @nxpnsv your solution is the fastest so far, please add it as an answer. I can't get numba working though. Still trying that. Commented Dec 13, 2018 at 13:46

3 Answers


You can use the Numba compiler and speed things up by applying the njit decorator to your functions. Below is an example for a very large size:

import numpy as np
from numba import njit

def gen_mask_1(size, p=0.75):
    return np.random.binomial(1, p, size)

@njit(parallel=True)
def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

%timeit gen_mask_1(size=100000)
%timeit gen_mask_2(size=100000)

2.33 ms ± 215 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
512 µs ± 25.1 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)



Another option is numpy.random.choice, with an input of 0s and 1s where the proportion of 1s is p. For example, for p = 0.75, use np.random.choice([0, 1, 1, 1], size=n):

In [303]: np.random.choice([0, 1, 1, 1], size=16)
Out[303]: array([1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0])

This is faster than using np.random.binomial:

In [304]: %timeit np.random.choice([0, 1, 1, 1], size=10000)
71.8 µs ± 368 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [305]: %timeit np.random.binomial(1, 0.75, 10000)
174 µs ± 348 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

To handle an arbitrary value for p, you can use the p option of np.random.choice, but then the code is slower than np.random.binomial:

In [308]: np.random.choice([0, 1], p=[0.25, 0.75], size=16)
Out[308]: array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0])

In [309]: %timeit np.random.choice([0, 1], p=[0.25, 0.75], size=10000)
227 µs ± 781 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
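As an aside (not from the original answer): the boolean-threshold trick handles arbitrary p just as directly, since P(u < p) = p for a uniform sample u, so it keeps its speed advantage without the weighted choice call. A sketch, with gen_mask as a hypothetical name:

```python
import numpy as np

def gen_mask(size, p):
    # P(u < p) = p for u uniform on [0, 1), so this works for any p
    return (np.random.rand(size) < p).astype(int)

m = gen_mask(10000, p=0.6)  # proportion of ones is close to 0.6
```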


I wonder if a hybrid of your answer and mine would be even faster, i.e., using np.random.choice with the njit decorator applied.

As I said in a comment on the question, the implementation

def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

can be improved by using the fact that the comparison yields a bool array, which can then be converted to int. This removes the two masked assignments you would otherwise need, and it makes for a pretty one-liner :)

def gen_mask_2(size, p=0.75):
    return (np.random.rand(size) < p).astype(int)
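One possible further tweak (an aside, not part of the original answer): if the mask is only ever multiplied into float activations, the astype(int) cast can be skipped entirely, because NumPy treats booleans as 0/1 in arithmetic. A sketch, with gen_mask_bool as a hypothetical name:

```python
import numpy as np

def gen_mask_bool(size, p=0.75):
    # a bool array behaves like 0/1 under multiplication, so no cast is needed
    return np.random.rand(size) < p

x = np.ones(2048)
dropped = x * gen_mask_bool(2048)  # float64 result with zeroed entries
```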
