I'm experimenting with NumPy arrays and have run into the following problem. I want to map a function that takes arrays as input over matrices, so that the function is applied pairwise to corresponding elements of two or more matrices, where those elements are themselves arrays.

import numpy as np

# random_integers is deprecated; randint(1, 101) draws from the same range, 1 to 100 inclusive
x = np.random.randint(1, 101, size=(5,4))
y = np.random.randint(1, 101, size=(5,4))

print(x); print(y)
[[17 84 60 56]
 [58 71 50 90]
 [80 25 43 55]
 [18 25 77 25]
 [62 49 42 11]]
[[ 9 51 83 58]
 [34 63 26 32]
 [27 54 63 80]
 [29 42 10  6]
 [53 52 45 87]]


# np.dot(x,y) fails: shapes (5,4) and (5,4) are not aligned for a matrix product

v = np.vectorize(np.dot)
z = v(x,y)

print(z)
[[ 153 4284 4980 3248]
 [1972 4473 1300 2880]
 [2160 1350 2709 4400]
 [ 522 1050  770  150]
 [3286 2548 1890  957]]    # this is wrong

np.sum(z[0]) == np.dot(x[0], y[0])  # evaluates to True
# the vectorized dot function was applied to the individual scalar elements of the
# innermost (second) dimension, when it should instead be applied to the row arrays
# along the first dimension

# I could instead use list comprehension
z = [np.dot(a, b) for a, b in zip(x,y)]

print(z)
[12665, 10625, 10619, 2492, 8681]    
# this would be a correct mapping of the dot function over the matrices x and y

The problem with list comprehensions is that I'm afraid they're inefficient, since they're a Python feature and not a NumPy feature.

vectorize is a Python-level loop; it also feeds scalars to your function, not arrays. I know it's tempting to use it without reading its docs, but reading them will save you some false starts. Commented May 12, 2017 at 2:55

2 Answers

2

You can multiply x and y and then sum the results by rows:

(x * y).sum(1)
# array([12665, 10625, 10619,  2492,  8681])

Or use numpy.einsum:

np.einsum("ij,ij->i", x, y)
# array([12665, 10625, 10619,  2492,  8681])
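
As a quick check, the sketch below uses the x and y values printed in the question and compares both expressions with the list comprehension. The names `by_sum`, `by_einsum`, and `by_loop` are only illustrative.

```python
import numpy as np

# Values copied from the arrays printed in the question.
x = np.array([[17, 84, 60, 56],
              [58, 71, 50, 90],
              [80, 25, 43, 55],
              [18, 25, 77, 25],
              [62, 49, 42, 11]])
y = np.array([[ 9, 51, 83, 58],
              [34, 63, 26, 32],
              [27, 54, 63, 80],
              [29, 42, 10,  6],
              [53, 52, 45, 87]])

by_sum = (x * y).sum(axis=1)                # elementwise product, then sum each row
by_einsum = np.einsum("ij,ij->i", x, y)     # same contraction written as an einsum
by_loop = np.array([np.dot(a, b) for a, b in zip(x, y)])  # the question's loop

print(np.array_equal(by_sum, by_einsum), np.array_equal(by_sum, by_loop))
```

Both array expressions give the same five row-wise dot products as the loop, without iterating in Python.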

1 Comment

Thank you. I like this because it uses only NumPy operations, so I expect it to be faster than anything involving Python-level operations such as a list comprehension. But what if I had some other function that I wanted to map over an arbitrary dimension? Is there a general way to do that, like vectorization?
1

If the function is arbitrary, then you may have to use the list comprehension, or some equivalent iteration

z = [f(row_x, row_y) for row_x, row_y in zip(x,y)]

where x and y are (n,m) arrays, and z will be a list of n results (shape (n,?) once converted to an array, depending on what f returns).

This iteration works on the first dimension of x and y, treating the arrays as lists of arrays.
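
A minimal sketch of that iteration wrapped in a function, assuming every call to f returns a result of the same shape so the results can be packed into one array. `map_rows` is a hypothetical helper name, not a NumPy function, and it is still a Python-level loop.

```python
import numpy as np

def map_rows(f, *arrays):
    """Apply f to matching first-axis slices (rows) of the input arrays.

    Hypothetical helper: this is the same Python loop as the list
    comprehension; it only packs the per-row results into one array.
    """
    return np.array([f(*rows) for rows in zip(*arrays)])

x = np.arange(20).reshape(5, 4)
y = np.ones((5, 4), dtype=int)

dots = map_rows(np.dot, x, y)      # one scalar per row, shape (5,)
outer = map_rows(np.outer, x, y)   # one (4, 4) array per row, shape (5, 4, 4)

print(dots.shape, outer.shape)
```

The shape of the result after the first dimension depends entirely on what f returns for one pair of rows.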

As you found, vectorize does not do what you want, because it works element-wise; that is, it passes scalars, not rows, to the function.
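
For comparison, np.vectorize also accepts a signature argument that makes it pass 1-D slices instead of scalars. The sketch below uses small made-up arrays and the illustrative name `row_dot`; it is still a Python-level loop, so it is a convenience rather than a speed-up.

```python
import numpy as np

x = np.arange(20).reshape(5, 4)
y = np.arange(20, 40).reshape(5, 4)

# Default behaviour: np.dot is called once per pair of scalars.
elementwise = np.vectorize(np.dot)(x, y)    # shape (5, 4), same values as x * y

# With a signature, each call receives one whole row from each input.
row_dot = np.vectorize(np.dot, signature="(n),(n)->()")
rowwise = row_dot(x, y)                     # shape (5,)

print(elementwise.shape, rowwise.shape)
```

The signature string says that each call consumes two vectors of equal length n and returns a scalar, so the loop runs over the remaining leading dimension.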

As the other answer shows, it is easy to express the dot product as something that works row by row. Where possible, take that route. Look at the components of your calculation and ask which ones operate element by element, which operate row by row, and so on. Many basic math functions work that way. But there are some operations that only work on 1d arrays, for example unique and in1d.
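
To illustrate the 1d case with unique: called without an axis argument, np.unique flattens its input, so per-row results need an explicit loop. The array `a` below is made-up example data.

```python
import numpy as np

a = np.array([[1, 1, 2, 3],
              [4, 4, 4, 4],
              [5, 6, 7, 8]])

# Without an axis argument, np.unique flattens the input first.
flat_unique = np.unique(a)

# Per-row results have different lengths, so they cannot form a regular
# 2-D array; a list of 1-D arrays is the natural container.
row_unique = [np.unique(row) for row in a]

print(flat_unique)
print([r.tolist() for r in row_unique])
```

Because the rows yield 3, 1, and 4 unique values respectively, no fixed-shape array operation can return them directly.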

Before trying to come up with something that works for an arbitrary function, learn how to work with multidimensional arrays in simpler cases. You'll get a lot further that way.
