I'm looking into using numba to speed up iterative calculations, particularly where a calculation depends on the results of previous ones, so vectorizing isn't always applicable. One limitation I've found is that numba doesn't appear to accept DataFrames. No problem, I thought: I can pass a 2D numpy array along with a numpy array of column names, so I tried to implement a function that refers to values by column name rather than by index. Here's the code I have so far:
from numba import jit
import numpy as np
@jit(nopython=True)
def get_index(cols, col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
@jit(nopython=True)
def get_element(ndarr: np.ndarray, cols: np.ndarray, row: np.int8, name: str):
    ind = get_index(cols, name)
    print(row)
    print(ind)
    print(ndarr[0][0])
    #print(ndarr[row][ind])
get_element(np.array([['HI'], ['BYE'], ['HISAHASDG']]), np.array(['COLUMN_1']), 0, "COLUMN_1")
I've tested get_index independently and it works. It is basically an implementation of np.where, and I was wondering whether that could be what is driving my error. With the last print commented out, the code runs and prints 0, 0, and then "HI", as expected. So in theory the commented-out line should also just print "HI", as the print before it does, because both row and ind are 0. But when I uncomment it, I get the following:
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
<timed exec> in <module>
/sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
399 e.patch_message(msg)
400
--> 401 error_rewrite(e, 'typing')
402 except errors.UnsupportedError as e:
403 # Something unsupported is present in the user code, add help info
/sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
342 raise e
343 else:
--> 344 reraise(type(e), e, None)
345
346 argtypes = []
/sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/utils.py in reraise(tp, value, tb)
78 value = tp()
79 if value.__traceback__ is not tb:
---> 80 raise value.with_traceback(tb)
81 raise value
82
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<built-in function getitem>) with argument(s) of type(s): (array([unichr x 50], 1d, C), OptionalType(int64) i.e. the type 'int64 or None')
* parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
In definition 2:
All templates rejected with literals.
In definition 3:
All templates rejected without literals.
In definition 4:
All templates rejected with literals.
In definition 5:
All templates rejected without literals.
In definition 6:
All templates rejected with literals.
In definition 7:
All templates rejected without literals.
In definition 8:
All templates rejected with literals.
In definition 9:
All templates rejected without literals.
In definition 10:
All templates rejected with literals.
In definition 11:
All templates rejected without literals.
In definition 12:
TypeError: unsupported array index type OptionalType(int64) i.e. the type 'int64 or None' in [OptionalType(int64)]
raised from /sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/typing/arraydecl.py:69
In definition 13:
TypeError: unsupported array index type OptionalType(int64) i.e. the type 'int64 or None' in [OptionalType(int64)]
raised from /sas/python/app/miniconda3/envs/py3lu/lib/python3.6/site-packages/numba/core/typing/arraydecl.py:69
In definition 14:
All templates rejected with literals.
In definition 15:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: typing of intrinsic-call at <timed exec> (15)
File "<timed exec>", line 15:
<source missing, REPL/exec in use?>
Is there something I'm missing? I checked the types of both row and ind, and they are indeed ints. Why is numba not letting me index the array with int variables? Thanks.
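One thing that may be relevant to the "int64 or None" in the error: a minimal plain-Python sketch of the same lookup shows that the function has a second, implicit return path when no column matches. This is an illustration under assumptions, not the numba code itself: there is no @jit, a list stands in for the numpy array of column names, and the name get_index_plain is hypothetical.

```python
# Hypothetical plain-Python version of get_index, for illustration only
# (no numba; a list stands in for the numpy array of column names).
def get_index_plain(cols, col):
    for i in range(len(cols)):
        if cols[i] == col:
            return i
    # No explicit return here: Python returns None when nothing matches.


found = get_index_plain(["COLUMN_1"], "COLUMN_1")
missing = get_index_plain(["COLUMN_1"], "COLUMN_2")
print(found)    # 0
print(missing)  # None
```

So the return value is an int on one path and None on the other, which looks consistent with the OptionalType(int64) in the message, though I have not confirmed that this is what numba is inferring for ind.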