I'm looking into using numba to speed up iterative calculations, particularly where a calculation depends on the results of previous ones, so vectorizing isn't always applicable. One limitation I've found is that it doesn't appear to accept dataframes. No problem, I thought: you can pass a 2D numpy array along with a numpy array of column names. So I tried to implement a function that refers to values by column name rather than by index. Here's the code I have so far.

from numba import jit
import numpy as np
@jit(nopython=True)
def get_index(cols,col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
@jit(nopython=True)
def get_element(ndarr: np.ndarray,cols:np.ndarray,row:np.int8,name:str):
    ind = get_index(cols,name)
    print(row)
    print(ind)
    print(ndarr[0][0])
    #print(ndarr[row][ind])
get_element(np.array([['HI'],['BYE'],['HISAHASDG']]),np.array(['COLUMN_1']),0,"COLUMN_1")

I have get_index, which I've tested independently and it works. It is basically an implementation of np.where, and I wondered whether it could be driving my error. With the last print commented out, the code runs: it prints 0, 0, and then "HI" as expected. So in theory the commented-out line should also just print "HI", exactly as the previous print does, because both row and ind are 0. But when I uncomment it, I get the following:

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<timed exec> in <module>

/sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
    399                 e.patch_message(msg)
    400 
--> 401             error_rewrite(e, 'typing')
    402         except errors.UnsupportedError as e:
    403             # Something unsupported is present in the user code, add help info

/sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
    342                 raise e
    343             else:
--> 344                 reraise(type(e), e, None)
    345 
    346         argtypes = []

/sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/utils.py in reraise(tp, value, tb)
     78         value = tp()
     79     if value.__traceback__ is not tb:
---> 80         raise value.with_traceback(tb)
     81     raise value
     82 

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<built-in function getitem>) with argument(s) of type(s): (array([unichr x 50], 1d, C), OptionalType(int64) i.e. the type 'int64 or None')
 * parameterized
In definition 0:
    All templates rejected with literals.
In definition 1:
    All templates rejected without literals.
In definition 2:
    All templates rejected with literals.
In definition 3:
    All templates rejected without literals.
In definition 4:
    All templates rejected with literals.
In definition 5:
    All templates rejected without literals.
In definition 6:
    All templates rejected with literals.
In definition 7:
    All templates rejected without literals.
In definition 8:
    All templates rejected with literals.
In definition 9:
    All templates rejected without literals.
In definition 10:
    All templates rejected with literals.
In definition 11:
    All templates rejected without literals.
In definition 12:
    TypeError: unsupported array index type OptionalType(int64) i.e. the type 'int64 or None' in [OptionalType(int64)]
    raised from /sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/typing/arraydecl.py:69
In definition 13:
    TypeError: unsupported array index type OptionalType(int64) i.e. the type 'int64 or None' in [OptionalType(int64)]
    raised from /sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/typing/arraydecl.py:69
In definition 14:
    All templates rejected with literals.
In definition 15:
    All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: typing of intrinsic-call at <timed exec> (15)

File "<timed exec>", line 15:
<source missing, REPL/exec in use?>

Is there something I'm missing? I checked the types of both row and ind and they are indeed int types. Why is numba not letting me subset with int variables? Thanks.

1 Answer

numba is being really clever here! Consider what happens when you pass a col to get_index that isn't in cols. cols[i] == col will never be true, the loop will exit, and since there's no catchall return at the end of the function, the return value will be None.

numba therefore correctly infers that the return type of get_index is OptionalType(int64) i.e. a value that may either be int64 or None. But None isn't a valid type for indices, so you can't use a value that might be None to index an array.
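To see the implicit None without numba in the picture, here is a minimal plain-Python sketch of the same function. The column name "COLUMN_2" is just an assumed example of a name that isn't present.

```python
# Plain Python (no numba): the original get_index, unchanged.
def get_index(cols, col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
    # No return statement here, so falling out of the loop returns None.


found = get_index(["COLUMN_1"], "COLUMN_1")
missing = get_index(["COLUMN_1"], "COLUMN_2")  # assumed missing name

print(found)    # index of the matching column
print(missing)  # the implicit None that numba has to account for
```

Because both outcomes are reachable, the only return type that covers them is "int64 or None", which is what shows up in the error message.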

You can fix this by adding a catchall return at the end.

@jit(nopython=True)
def get_index(cols, col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
    return -1
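One caveat with a -1 sentinel, sketched below in plain Python and numpy: -1 is itself a valid index (the last element), so the caller has to check for it explicitly. The names NOT_FOUND and safe_get are hypothetical, introduced only for this illustration.

```python
import numpy as np

NOT_FOUND = -1  # hypothetical name for the sentinel value


def get_index(cols, col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
    return NOT_FOUND


cols = np.array(["A", "B", "C"])
row = np.array([10.0, 20.0, 30.0])

ind = get_index(cols, "MISSING")
# Indexing with -1 does not fail: it silently picks the last element.
silently_wrong = row[ind]


def safe_get(row, cols, col):
    # Hypothetical helper: check the sentinel before indexing.
    ind = get_index(cols, col)
    if ind == NOT_FOUND:
        raise KeyError(col)
    return row[ind]
```

Without the check in safe_get, a misspelled column name would return data from the wrong column rather than fail.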

Of course this might not be the behavior you want in this case; it's probably better to raise an exception, which numba also handles correctly.

@jit(nopython=True)
def get_index(cols, col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
    raise IndexError('list index out of range')
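Putting it together, a sketch of how the raising version might be used from get_element. This assumes a numeric 2D array rather than the strings in the question, and it returns the value instead of printing it. The try/except fallback around the import is an assumption added only so the snippet also runs where numba is not installed; the error message text is likewise illustrative.

```python
import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumed fallback for illustration: without numba, run as plain Python.
    def jit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@jit(nopython=True)
def get_index(cols, col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
    # Every path now either returns an int or raises, so the
    # inferred return type is a plain integer, not an optional one.
    raise IndexError('column name not found')


@jit(nopython=True)
def get_element(ndarr, cols, row, name):
    ind = get_index(cols, name)
    return ndarr[row, ind]


data = np.array([[1.0, 2.0], [3.0, 4.0]])
cols = np.array(['COLUMN_1', 'COLUMN_2'])

value = get_element(data, cols, 1, 'COLUMN_2')
print(value)
```

Asking for a name that is not in cols should now surface as an IndexError at the call site instead of a typing error at compile time.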
1 Comment

wow that's insane, I did see the optional part, but thought maybe that was just used for all int return values. Thank you so much, adding the return did the trick :)