1

I have a 1 dimensional numpy ndarray V. The operation I have is H = (V/a)**b, where a and b are just scalars. This is simple, but I need to run multiple experiments (thousands) where I try different variations of a's and b's. Its easy to do this with for loops, but I want this to run as fast as possible so I want to vectorize. So, lets say I have generated 1D ndarrays of a and b as well. given the 3 1D arrays V, a, and b, how do I go about doing this efficiently? Thanks!

1
  • Do you want to try with multiple pairs of values for a and b (like (a1, b1), (a2, b2), etc) or to have N possibilities for a, M possibilities for b and try every combination (N*M in total)? Commented Sep 26, 2017 at 12:06

2 Answers 2

1

You can do something like this:

import numpy as np

V = np.arange(100.0)
a = np.random.rand(50)
b = np.random.rand(50)

result = (V[np.newaxis, :] / a[:, np.newaxis]) ** b[:, np.newaxis]

result will have size 50x100, so result[i] will be the result of using the parameters a[i] and b[i].

Sign up to request clarification or add additional context in comments.

1 Comment

awesome. Thanks @jdehesa. I was not aware of the use of np.newaxis before. This SO question is a good primer - stackoverflow.com/questions/29241056/… For those curious, the formula I'm using is the cumulative hazard function for a Weibull model in survival analysis. V is a time vector, a and b are fitting parameters.
0

That's where broadcasting comes in, you can add dimensions so the arrays broadcast along each axis:

(V/a[:, None])**b[:, None, None]

This adds one trailing dimension to a and two trailing dimensions to b and broadcasting will make sure you'll get a 3D array. That works because broadcasting will always "broadcast" along the last axis.

For example:

>>> import numpy as np
>>> V = np.array([1,2,3])
>>> a = np.array([0.1, 0.2, 0.3])
>>> b = np.array([1, 0.5, 0.34])

>>> a[:, None]
array([[ 0.1],
       [ 0.2],
       [ 0.3]])

>>> b[:, None, None]
array([[[ 1.  ]],

       [[ 0.5 ]],

       [[ 0.34]]])

>>> (V/a[:, None])**b[:, None, None]
array([[[ 10.        ,  20.        ,  30.        ],
        [  5.        ,  10.        ,  15.        ],
        [  3.33333333,   6.66666667,  10.        ]],

       [[  3.16227766,   4.47213595,   5.47722558],
        [  2.23606798,   3.16227766,   3.87298335],
        [  1.82574186,   2.5819889 ,   3.16227766]],

       [[  2.18776162,   2.7691737 ,   3.17849276],
        [  1.72842206,   2.18776162,   2.51114059],
        [  1.50583981,   1.90602666,   2.18776162]]])

Note that this may take a lot of memory. So it may not be possible to use this if all your arrays are very long.

For example if all arrays contain 1000 values the result will contain 1 billion elements which takes a lot of RAM (~7.5GB with float64s).

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.