
I need to generate dropout masks for a specific neural network, and I am looking for the fastest possible way to do this with NumPy (CPU only).

I have tried:

import numpy as np

def gen_mask_1(size, p=0.75):
    return np.random.binomial(1, p, size)


def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

where p is the probability of drawing a 1.

The speed of these two approaches is comparable.

%timeit gen_mask_1(size=2048)
%timeit gen_mask_2(size=2048)

45.9 µs ± 575 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
47.4 µs ± 372 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Are there faster methods?
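As a related option (untimed in the benchmarks above): on NumPy 1.17+, the Generator API from np.random.default_rng is generally faster than the legacy np.random functions. A minimal sketch of the same thresholding idea, where gen_mask_rng is a hypothetical name:

```python
import numpy as np

# Sketch only: the threshold trick with NumPy's newer Generator API
# (NumPy >= 1.17); gen_mask_rng is a hypothetical name.
rng = np.random.default_rng()

def gen_mask_rng(size, p=0.75):
    # Generator.random is generally faster than the legacy np.random.rand
    return (rng.random(size) < p).astype(int)

mask = gen_mask_rng(2048)
```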

UPDATE

Following the suggestions so far, I have tested a few extra implementations. I couldn't get @njit to work with parallel=True (TypingError: Failed in nopython mode pipeline (step: convert to parfors)); it works without it, but, I think, less efficiently. I also found a Python binding for Intel's mkl_random (thank you @MatthieuBrucher for the tip!) here: https://github.com/IntelPython/mkl_random So far, using mkl_random together with @nxpnsv's approach gives the best result.

@njit
def gen_mask_3(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

def gen_mask_4(size, p=0.75):
    return (np.random.rand(size) < p).astype(int)

def gen_mask_5(size):
    return np.random.choice([0, 1, 1, 1], size=size)

def gen_mask_6(size, p=0.75):
    return (mkl_random.rand(size) < p).astype(int)

def gen_mask_7(size):
    return mkl_random.choice([0, 1, 1, 1], size=size)

%timeit gen_mask_4(size=2048)
%timeit gen_mask_5(size=2048)
%timeit gen_mask_6(size=2048)
%timeit gen_mask_7(size=2048)

22.2 µs ± 145 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
25.8 µs ± 336 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7.64 µs ± 133 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
29.6 µs ± 1.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
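For context (not part of the benchmarks above), a mask like these is typically applied with "inverted dropout": the kept activations are divided by the keep probability p at training time so that no rescaling is needed at inference. A minimal sketch, with apply_dropout as a hypothetical helper:

```python
import numpy as np

def apply_dropout(x, p=0.75):
    # p is the keep probability, matching the convention in the question
    mask = (np.random.rand(*x.shape) < p).astype(x.dtype)
    # inverted dropout: dividing by p keeps the expected activation unchanged
    return x * mask / p

out = apply_dropout(np.ones((64, 64)))
```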
  • Probably not using numpy random number generators. Perhaps by using MKL directly (software.intel.com/en-us/forums/intel-math-kernel-library/topic/…) Commented Dec 13, 2018 at 12:46
  • @MatthieuBrucher I am open to alternatives, the mask needs to be a np.array at the end, so I will need to do the conversion anyway. But if you know a faster way of generating random digits in python please share it. Commented Dec 13, 2018 at 12:48
  • It might be faster, but nothing for certain. MKL has routines for this (software.intel.com/en-us/…), but there is no example to create the state and call the function from Python. Commented Dec 13, 2018 at 12:50
  • This version of gen_mask_2 runs faster than either of your attempts: (np.random.rand(size) < p).astype(int) Commented Dec 13, 2018 at 13:15
  • @nxpnsv your solution is the fastest so far, please add it as an answer. I can't get numba working though. Still trying that. Commented Dec 13, 2018 at 13:46

3 Answers


You can use the Numba compiler and speed things up by applying the njit decorator to your functions. Below is an example for a very large size:

import numpy as np
from numba import njit

def gen_mask_1(size, p=0.75):
    return np.random.binomial(1, p, size)

@njit(parallel=True)
def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

%timeit gen_mask_1(size=100000)
%timeit gen_mask_2(size=100000)

2.33 ms ± 215 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
512 µs ± 25.1 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)



Another option is numpy.random.choice, with an input of 0s and 1s where the proportion of 1s is p. For example, for p = 0.75, use np.random.choice([0, 1, 1, 1], size=n):

In [303]: np.random.choice([0, 1, 1, 1], size=16)
Out[303]: array([1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0])

This is faster than using np.random.binomial:

In [304]: %timeit np.random.choice([0, 1, 1, 1], size=10000)
71.8 µs ± 368 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [305]: %timeit np.random.binomial(1, 0.75, 10000)
174 µs ± 348 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

To handle an arbitrary value for p, you can use the p option of np.random.choice, but then the code is slower than np.random.binomial:

In [308]: np.random.choice([0, 1], p=[0.25, 0.75], size=16)
Out[308]: array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0])

In [309]: %timeit np.random.choice([0, 1], p=[0.25, 0.75], size=10000)
227 µs ± 781 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
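As an aside (not from the original answer): the boolean-threshold trick handles arbitrary p just as directly, since P(u < p) = p for a uniform sample u, so it keeps its speed advantage without the weighted choice call. A sketch, with gen_mask as a hypothetical name:

```python
import numpy as np

def gen_mask(size, p):
    # P(u < p) = p for u uniform on [0, 1), so this works for any p
    return (np.random.rand(size) < p).astype(int)

m = gen_mask(10000, p=0.6)  # proportion of ones is close to 0.6
```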


I wonder if a hybrid of your answer and mine would be even faster, i.e., using np.random.choice with the njit decorator applied.

As I said in a comment on the question, the implementation

def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

can be improved by using the fact that the comparison yields a bool array, which can then be converted to int. This removes the two masked assignments you would otherwise need, and it makes for a pretty one-liner :)

def gen_mask_2(size, p=0.75):
    return (np.random.rand(size) < p).astype(int)
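One possible further tweak (an aside, not part of the original answer): if the mask is only ever multiplied into float activations, the astype(int) cast can be skipped entirely, because NumPy treats booleans as 0/1 in arithmetic. A sketch, with gen_mask_bool as a hypothetical name:

```python
import numpy as np

def gen_mask_bool(size, p=0.75):
    # a bool array behaves like 0/1 under multiplication, so no cast is needed
    return np.random.rand(size) < p

x = np.ones(2048)
dropped = x * gen_mask_bool(2048)  # float64 result with zeroed entries
```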
