Given an array of points with shape (2, n), the function returns the n interpolated values. The function should also work if points is a 1-D array of shape (2,). I think Numba's vectorize could be used to solve this elegantly, but I haven't gotten it to work.
import math

import numpy as np
from numba import njit, prange

# prange only runs in parallel when parallel=True is passed to njit
@njit(parallel=True)
def bilinear_interpolation(points, matrix, axis_0_start, axis_0_step, axis_1_start, axis_1_step):
    res = np.empty(points.shape[1])
    for i in prange(points.shape[1]):
        # Fractional grid coordinates of the point along each axis
        point0_loc = (points[0, i] - axis_0_start) / axis_0_step
        point1_loc = (points[1, i] - axis_1_start) / axis_1_step
        # Indices of the four surrounding grid nodes
        idx_0l = math.floor(point0_loc)
        idx_0h = idx_0l + 1
        idx_1l = math.floor(point1_loc)
        idx_1h = idx_1l + 1
        mat_hl = matrix[idx_0h, idx_1l]
        mat_ll = matrix[idx_0l, idx_1l]
        mat_hh = matrix[idx_0h, idx_1h]
        mat_lh = matrix[idx_0l, idx_1h]
        # Weight each node by the area of the opposite sub-rectangle
        res[i] = (mat_ll * (idx_0h - point0_loc) * (idx_1h - point1_loc) +
                  mat_hl * (point0_loc - idx_0l) * (idx_1h - point1_loc) +
                  mat_lh * (idx_0h - point0_loc) * (point1_loc - idx_1l) +
                  mat_hh * (point0_loc - idx_0l) * (point1_loc - idx_1l))
    return res
I have tried:
import math

from numba import vectorize, float64

# Attempt: one scalar point per call, with the matrix passed as an array argument
@vectorize([float64(float64, float64, float64[:, :], float64, float64, float64, float64)])
def bilinear_interpolation(point0, point1, matrix, axis_0_start, axis_0_step, axis_1_start, axis_1_step):
    point0_loc = (point0 - axis_0_start) / axis_0_step
    point1_loc = (point1 - axis_1_start) / axis_1_step
    idx_0l = math.floor(point0_loc)
    idx_0h = idx_0l + 1
    idx_1l = math.floor(point1_loc)
    idx_1h = idx_1l + 1
    mat_hl = matrix[idx_0h, idx_1l]
    mat_ll = matrix[idx_0l, idx_1l]
    mat_hh = matrix[idx_0h, idx_1h]
    mat_lh = matrix[idx_0l, idx_1h]
    res = (mat_ll * (idx_0h - point0_loc) * (idx_1h - point1_loc) +
           mat_hl * (point0_loc - idx_0l) * (idx_1h - point1_loc) +
           mat_lh * (idx_0h - point0_loc) * (point1_loc - idx_1l) +
           mat_hh * (point0_loc - idx_0l) * (point1_loc - idx_1l))
    return res
@guvectorize is certainly the tool you are looking for. If it is not powerful enough, then I do not think you can use Numba vectorization at all (vectorize is more restricted). The computation of res can be sped up with SIMD instructions, though. Still, I expect the gather and the memory accesses to be the bottleneck. Thus, while you could get a performance improvement, it should be relatively small.