I want to apply an arbitrary function to each element of a 3D ndarray, where the function takes the array along the third dimension as its argument and returns a scalar. The result should be a 2D matrix.

For example, in pseudocode:

A = [[[1,2,3],[4,5,6]],
     [[7,8,9],[10,11,12]]]
A.apply_3d_array(sum)  # or apply_3d_array(A, sum) is also fine
>> [[6,15],[24,33]]

I understand this is possible with a loop over the dimensions given by ndarray.shape, but direct index access is inefficient, as the official documentation says. Is there a more efficient way than looping?

The function I actually apply to each element array is:

import math

def chromaticity(pixel):
    geo_mean = math.pow(sum(pixel), 1/3)
    return list(map(lambda x: math.log(x / geo_mean), pixel))
  • Can you share the implementation of the arbitrary function? For noticeable speedups, that might be the key here. Commented Sep 3, 2016 at 16:04
  • Thank you for the comment, @Divakar. By "arbitrary function" I mean any function that takes an array argument and returns some value. Commented Sep 3, 2016 at 16:21
  • I know what you meant by arbitrary function here. I meant that if you are looking to speed up some specific function, could you share its implementation? We can probably use NumPy ufuncs to vectorize the operations involved in that function. Commented Sep 3, 2016 at 16:26
  • Oh, I'm sorry for the misunderstanding. I'm not a native English speaker, so I may use odd expressions or misinterpret things. In fact, I apply the function below to each element array: def chromaticity(pixel): geo_mean = math.pow(sum(pixel),1/3); return map(lambda x: math.log(x/geo_mean), pixel) Commented Sep 3, 2016 at 16:49
  • Yes, that one! So, could you edit your question to add that function implementation? Commented Sep 3, 2016 at 17:04

2 Answers

Given the function implementation, we can vectorize it using NumPy ufuncs, which operate on the entire input array A in one go and thus avoid the math library functions, which don't support vectorized operations on arrays. In the process, we also bring in a very efficient vectorizing tool: NumPy broadcasting. So, we would have an implementation like so -

np.log(A/np.power(np.sum(A,2,keepdims=True),1/3))
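As a rough illustration of the broadcasting step, the sketch below looks at the shapes involved. It assumes the 2x2x3 sample array from the question as input; the variable names s_keep, s_drop and ratio are made up for this example. With keepdims=True the per-pixel sum keeps a trailing axis of length 1, so it lines up against the three channels of each pixel.

```python
import numpy as np

# Assumed sample input: the 2x2x3 array from the question, as floats.
A = np.arange(1, 13).reshape(2, 2, 3).astype(float)

# Per-pixel sum over the last axis, with and without keepdims.
s_keep = np.sum(A, axis=2, keepdims=True)  # shape (2, 2, 1)
s_drop = np.sum(A, axis=2)                 # shape (2, 2)

# (2, 2, 3) / (2, 2, 1) broadcasts the single sum across the 3 channels.
ratio = A / s_keep

print(s_keep.shape, s_drop.shape, ratio.shape)
```

Dividing by s_drop instead would try to align shapes (2, 2, 3) and (2, 2), which do not match along the last axis, so keepdims=True is what makes the one-liner work.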

Sample run and verification

With the lambda construct removed and NumPy functions used in place of the math library functions, the per-pixel function would look something like this -

def chromaticity(pixel): 
    geo_mean = np.power(np.sum(pixel),1/3) 
    return np.log(pixel/geo_mean)

Sample run with the iterative implementation -

In [67]: chromaticity(A[0,0,:])
Out[67]: array([-0.59725316,  0.09589402,  0.50135913])

In [68]: chromaticity(A[0,1,:])
Out[68]: array([ 0.48361096,  0.70675451,  0.88907607])

In [69]: chromaticity(A[1,0,:])
Out[69]: array([ 0.88655887,  1.02009026,  1.1378733 ])

In [70]: chromaticity(A[1,1,:])
Out[70]: array([ 1.13708257,  1.23239275,  1.31940413])    

Sample run with the proposed vectorized implementation -

In [72]: np.log(A/np.power(np.sum(A,2,keepdims=True),1/3))
Out[72]: 
array([[[-0.59725316,  0.09589402,  0.50135913],
        [ 0.48361096,  0.70675451,  0.88907607]],

       [[ 0.88655887,  1.02009026,  1.1378733 ],
        [ 1.13708257,  1.23239275,  1.31940413]]])
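The comparison above can also be checked programmatically rather than by eye. The following is a minimal sketch, assuming A is the 2x2x3 sample array from the question; the names looped and vectorized are chosen only for this example.

```python
import numpy as np

def chromaticity(pixel):
    # Per-pixel version, as defined earlier in this answer.
    geo_mean = np.power(np.sum(pixel), 1/3)
    return np.log(pixel / geo_mean)

# Assumed sample input: the 2x2x3 array from the question.
A = np.arange(1, 13).reshape(2, 2, 3)

# Iterative reference: apply the function pixel by pixel.
looped = np.empty(A.shape, dtype=float)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        looped[i, j] = chromaticity(A[i, j])

# Vectorized version operating on the whole array at once.
vectorized = np.log(A / np.power(np.sum(A, axis=2, keepdims=True), 1/3))

# Both approaches should agree up to floating-point tolerance.
print(np.allclose(looped, vectorized))
```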

Runtime test

In [131]: A = np.random.randint(0,255,(512,512,3)) # 512x512 colored image

In [132]: def org_app(A):
     ...:     out = np.zeros(A.shape)     
     ...:     for i in range(A.shape[0]):
     ...:         for j in range(A.shape[1]):
     ...:             out[i,j] = chromaticity(A[i,j])
     ...:     return out
     ...: 

In [133]: %timeit org_app(A)
1 loop, best of 3: 5.99 s per loop

In [134]: %timeit np.apply_along_axis(chromaticity, 2, A) #@hpaulj's soln
1 loop, best of 3: 9.68 s per loop

In [135]: %timeit np.log(A/np.power(np.sum(A,2,keepdims=True),1/3))
10 loops, best of 3: 90.8 ms per loop

That's why, when vectorizing array code, you should always try to push in NumPy functions and work on as many elements in one go as possible!

2 Comments

Great! This seems like the ideal solution, very close to what I wanted in the first place. Thanks.
@tkowt Added some timing test results to it.
apply_along_axis is designed to make this task easy:

In [683]: A=np.arange(1,13).reshape(2,2,3)
In [684]: A
Out[684]: 
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])
In [685]: np.apply_along_axis(np.sum, 2, A)
Out[685]: 
array([[ 6, 15],
       [24, 33]])

It, in effect, does

for all i,j:
    out[i,j] = func( A[i,j,:])

taking care of the details. It's not faster than doing that iteration yourself, but it makes it easier.
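Written out, that loop might look like the sketch below. It assumes the 2x2x3 array A from the session above and uses np.sum as the example function; np.ndindex is just one convenient way to walk the first two axes, and is not necessarily how apply_along_axis is implemented internally.

```python
import numpy as np

# Assumed sample input, same as in the session above.
A = np.arange(1, 13).reshape(2, 2, 3)

# Output has one scalar per (i, j) position.
out = np.empty(A.shape[:2], dtype=A.dtype)

# Visit every (i, j) pair and reduce the 1d slice along the last axis.
for i, j in np.ndindex(*A.shape[:2]):
    out[i, j] = np.sum(A[i, j, :])

# Compare against apply_along_axis on the same input.
print(np.array_equal(out, np.apply_along_axis(np.sum, 2, A)))
```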

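Another trick is to reshape your input to 2d, perform the simpler 1d iteration, and then reshape the result:

A1 = A.reshape(-1, A.shape[-1])      # collapse the leading axes into one
out = np.empty(A1.shape[0])          # one scalar result per row (float dtype)
for i in range(A1.shape[0]):
    out[i] = func(A1[i, :])          # func: any function returning a scalar
out = out.reshape(A.shape[:2])       # restore the original leading shape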

To do things faster, you need to dig into the guts of the function and figure out how to use compiled numpy operations on more than one dimension. In the simple case of sum, the function can already work on selected axes.
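For instance, reductions such as sum, max and mean accept an axis argument, so no Python-level loop is needed for them. A minimal sketch, assuming the same 2x2x3 array A as above (the result names are made up for this example):

```python
import numpy as np

# Assumed sample input, same as in the session above.
A = np.arange(1, 13).reshape(2, 2, 3)

# Reduce along the last axis directly; each result has shape (2, 2).
total = A.sum(axis=2)
largest = A.max(axis=-1)   # axis=-1 also refers to the last axis
average = A.mean(axis=2)

print(total)
```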
