I am using Numba a lot to speed up many loops which cannot be vectorised and would otherwise be very slow. My questions are:

  1. Can a Numba function create numpy arrays? I haven't found a way: functions like np.zeros don't work in Numba. What I do now is create empty arrays (initialised with zeros or NaNs) outside of Numba and pass them to my Numba function, which then fills them based on the calculation in my loop.

  2. Can Numba do a deep copy of a numpy array? I often have to work on many arrays of the same size. Numba can run array2 = array1, but array2 becomes a reference to array1 (changing one changes the other).
  3. When I have non-Numba functions with many inputs and many outputs, I like to create classes with no methods for inputs and outputs. This way I can run something like:

    myinput.input_1 = foo1
    myinput.input_2 = foo2
    myoutput = myfunction(myinput)

which is convenient when I have 20 inputs and 20 outputs. Can Numba support anything like this?

1 Answer

Numba is under active development, so the answer to your question depends on the version you have installed. In Numba >0.19, you gain the ability to create numpy arrays in nopython mode. All supported numpy constructs are listed at:

http://numba.pydata.org/numba-doc/0.20.0/reference/numpysupported.html

arr.copy() is also supported in nopython mode, at least as of 0.20 (the version I checked).
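
As a rough illustration of both points, here is a minimal sketch that assumes a Numba release recent enough to provide the `njit` decorator (shorthand for `jit(nopython=True)`). The function names `cumulative_sum` and `scaled_copy` are made up for this example, and the `ImportError` fallback is only there so the snippet still runs, uncompiled, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def cumulative_sum(values):
    # Allocate the output array inside the compiled function
    # instead of passing a pre-filled array in from outside.
    out = np.zeros(values.shape[0])
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i]
        out[i] = total
    return out


@njit
def scaled_copy(values, factor):
    # .copy() gives an independent array, so the caller's input
    # is not modified by the loop below.
    result = values.copy()
    for i in range(result.shape[0]):
        result[i] *= factor
    return result


data = np.array([1.0, 2.0, 3.0])
running = cumulative_sum(data)
doubled = scaled_copy(data, 2.0)
```

After the calls, `data` should be unchanged, because `scaled_copy` works on its own copy rather than on a reference to the input.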

As for passing in an object that holds arrays as attributes: you can do this in object mode (nopython=False), but it won't work in nopython mode. You'll then have to check what sort of speed-up you actually get. Numba may still be able to do some loop-lifting in that case.
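
One possible workaround, sketched here as an assumption rather than anything Numba prescribes, is to keep the container class on the Python side and unpack its attributes in a thin wrapper, so that only plain arrays reach the compiled loop. The names `Inputs` and `_kernel` and the arithmetic in the loop are hypothetical; `myfunction`, `input_1` and `input_2` follow the question. The `ImportError` fallback is again only there so the snippet runs without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, run as plain Python.
    def njit(func):
        return func


class Inputs:
    """Plain attribute container with no methods (hypothetical)."""
    pass


@njit
def _kernel(a, b):
    # The compiled loop only ever sees plain numpy arrays.
    out = np.empty(a.shape[0])
    for i in range(a.shape[0]):
        out[i] = a[i] + 2.0 * b[i]
    return out


def myfunction(myinput):
    # Ordinary Python wrapper: unpack the attributes here and
    # hand the arrays to the compiled kernel.
    return _kernel(myinput.input_1, myinput.input_2)


myinput = Inputs()
myinput.input_1 = np.array([1.0, 2.0])
myinput.input_2 = np.array([3.0, 4.0])
myoutput = myfunction(myinput)
```

The wrapper itself runs at normal Python speed, but it is called once per invocation, so the loop inside `_kernel` is where the time goes.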

My recommendation is, if possible, to stay up-to-date with the Numba releases. They are adding a lot of features and, in my experience, fixing a lot of bugs as well.

3 Comments

Can I pass a dataframe to Numba in nopython mode, so that Numba can then access dataframe.field1, dataframe.field2, etc? I would normally test this myself, but I don't have admin rights on my machine: installing a new version, testing it, and maybe rolling back to the previous one if anything is incompatible with my previous code is not the most straightforward process. Thanks!
I'm pretty sure Numba doesn't natively handle Pandas DataFrames in nopython mode. As for updating Numba, my recommendation is (if possible) to use the Anaconda Python Distribution (store.continuum.io/cshop/anaconda). It's free, does not require admin privileges to install, and will allow you to manage package versions via conda. Most people I know working in data science/scientific computing use it, and I highly recommend it.
That's the one I'm using, but admin rights are still needed to access and download the remote repositories. It has more to do with the network setup than with my specific PC.
