5

I'm doing a lot of vector algebra and want to use numpy arrays to remove any need for loops and run faster.

What I've found is that if I have a matrix A of size [N,P] I constantly need to use np.array([A[:,0]).T to force A[:,0] to be a column vector of size (N,1)

Is there a way to keep the single row or column of a 2D array as a 2D array because it makes the following arithmetic sooo much easier. For example, I often have to multiply a column vector (from a matrix) with a row vector (also created from a matrix) to create a new matrix: eg

C = A[:,i] * B[j,:]

it'd be be great if I didn't have to keep using:

C = np.array([A[:,i]]).T * np.array([B[j,:]])

It really obfuscates the code - in MATLAB it'd simply be C = A[:,i] * B[j,:] which is easier to read and compare with the underlying mathematics, especially if there's a lot of terms like this in the same line, but unfortunately most of my colleagues don't have MATLAB licenses.

Note this isn't the only use case, so a specific function for this column x row operation isn't too helpful

6
  • 1
    Have you considered using Octave? Commented Dec 16, 2021 at 13:02
  • Just in general my organisation uses Python quite heavily so it's best if I can stick to Python Commented Dec 16, 2021 at 13:58
  • Also, a minimal reproducable example will include many individual instances where Python changes a matrix to a 1D array Commented Dec 16, 2021 at 14:01
  • 1
    You are misusing the term matrix. A 2-D array is not a matrix in numpy. It's an array and remains an array when slices are selected. Your question seems to be: Can I use MATLAB syntax with numpy? The answer is: No, you can't, you actually have to learn numpy to use numpy. Commented Dec 16, 2021 at 14:14
  • 1
    Yes, I'm using matrix in the mathematical sense, where a 2D array is a matrix. The question is, can I use less clunky numpy syntax in numpy? And MATLAB is a nice example of less clunky syntax. Commented Dec 16, 2021 at 15:06

5 Answers 5

2

You can keep the dimension in the following way (using @ for matrix multiplication)

C = A[:,[i]] @ B[[j],:]

Notice the brackets around i and j, otherwise C won't be a 2-dimensional matrix.

Sign up to request clarification or add additional context in comments.

Comments

1

Even MATLAB/Octave squeezes out excess dimensions:

>> ones(2,3,4)(:,:,1)
ans = 
   1   1   1
   1   1   1
>> size(ones(2,3,4)(1,:))       # some indexing "flattens" outer dims
ans =
    1   12

When I started MATLAB v3.5 2d matrix was all it had; cells, struct and higher dimensions were later additions (as demonstrated by the above examples).

Your:

In [760]: A=np.arange(6).reshape(2,3)  
In [762]: np.array([A[:,0]]).T
Out[762]: 
array([[0],
       [3]])

is more convoluted than needed. It makes a list, then a (1,N) array from that, and finally a (N,1)

A[:,[0]], A[:,:,None], A[:,0:1] are more direct. Even A[:,0].reshape(-1,1)

I can't think of something simple that treats a scalar and list index the same.

Functions like np.atleast_2d can conditionally add a new dimension, but it will be a leading (outer) one. But by the rules of broadcasting leading dimensions are usually 'automatic'.

basic v advanced indexing

In the underlying Python, scalars can't be indexed, and lists can only be indexed with scalars and slices. The underlying syntax allows indexing with tuples, but lists reject those. It's numpy that has extended the indexing considerably - not with syntax but with how it handles those tuples.

numpy indexing with slices and scalars is basic indexing. That's where the dimension loss can occur. That's consistent with list indexing

In [768]: [[1,2,3],[4,5,6]][1]
Out[768]: [4, 5, 6]
In [769]: np.array([[1,2,3],[4,5,6]])[1]
Out[769]: array([4, 5, 6])

Indexing with lists and arrays is advanced indexing, without any list counterparts. This is perhaps where the differences between MATLAB and numpy are ugliest :)

 >> A([1,2],[1,2])

produces a (2,2) block. In numpy that produces a "diagonal"

In [781]: A[[0,1],[0,1]]
Out[781]: array([0, 4])

To get the block we have to use lists (or arrays) that "broadcast" against each other:

In [782]: A[[[0],[1]],[0,1]]      
Out[782]: 
array([[0, 1],
       [3, 4]])

To get the "diagonal" in MATLAB we have to use sub2ind([2,2],[1,2],[1,2]) to get the [1,4] flat indices.

What kind of multiplication?

In

np.array([A[:,i]]).T * np.array([B[j,:]])  

is this elementwise (.*) or matrix?

For a (N,1) and (1,M) pair, A*B and A@B produce the same (N,M) result, but one uses broadcasting to generalize the outer product, and the other is inner/matrix product (with sum-of-products).

Comments

0

https://numpy.org/doc/stable/reference/generated/numpy.matrix.html

Returns a matrix from an array-like object, or from a string of data. A matrix is a specialized 2-D array that retains its 2-D nature through operations. It has certain special operators, such as * (matrix multiplication) and ** (matrix power).

I'm not sure how to re-implement it though, it's an interesting exercise.

As mentionned, matrix will be deprecated. But from np.array, you can specify the dimension with the argument ndim=2:

np.array([1, 2, 3], ndmin=2)

5 Comments

Did you read the note? Using np.matrix is no longer recommended.
@MichaelSzczesny Had to read twice and scroll down and read it twice again to see it, sorry!
@MichaelSzczesny I edited my answer to add an alternative with np.array
np.matrix was intended for wayward MATLAB users, - such as this. The OP has a limited understanding of the differences.
Indeed, I'd love something that was MATLAB like where all data is treated the same In Python it seems a = 12 and a = [12] are entirely different beasts even though the underlying data is the same so any function you wish to carry out on them is entirely unambiguous eg find the length or test if it's greater than 3 - in Python you have to use different functions... I run into this all the time and write lots of awkward code rather than Python being able to handle it in a MATLAB-like way What if i and j are vectors (of the same size), will C = A[:, [i]] * B[[j],:] work? Who knows...
0

I like joostblack's answer from a readability standpoint. However from a speed standpoint, numpy's new axis is quite a bit faster, which might make a difference depending on use case.

import numpy as np
A = np.array([[1,2,3], [4,5,6]])
B = A.T
A[np.newaxis, 0, :] @ B[:,np.newaxis, 1]
python -m timeit -s "import numpy as np; a=np.array([[1,2,3],[4,5,6]]); b=a.T" "a[np.newaxis, 0, :] @ b[:, np.newaxis, 1]"

100000 loops, best of 5: 2.09 usec per loop

python -m timeit -s "import numpy as np; a=np.array([[1,2,3],[4,5,6]]); b=a.T" "a[[0], :] @ b[:, [1]]"

50000 loops, best of 5: 5.97 usec per loop

Comments

0

There is a function to do exactly what you want:

 np.outer(A[:,i], B[j,:])

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.