32

I'd like to be able to use list comprehension syntax to work with NumPy arrays easily.

For instance, I would like something like the below obviously wrong code to just reproduce the same array.

>>> X = np.random.randn(8,4)
>>> [[X[i,j] for i in X] for j in X[i]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: arrays used as indices must be of integer (or boolean) type

What is the easy way to do this, to avoid using range(len(X))?

  • X[i,j] is syntactic sugar for X[i][j] in NumPy Commented Jan 26, 2014 at 5:24
  • Don't do this! It defeats the entire purpose of using NumPy. Commented Jan 26, 2014 at 5:25
  • And (even if you fix it), this won't reproduce the array, it will produce a list of lists instead. Commented Jan 26, 2014 at 5:26
  • X[i,j] is not syntactic sugar. For X[i,j] the value is retrieved directly. X[i][j] gets the i'th row, and then the j'th element of that row. So it takes more time. Commented Jan 26, 2014 at 11:31
  • Are you trying to do something with the entries, such as square them? The question might make more sense in that case. Commented Apr 27, 2017 at 20:32
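
The difference between the two indexing forms discussed in these comments can be checked directly. This is a minimal sketch; the indices i and j are arbitrary values picked for illustration.

```python
import numpy as np

X = np.random.randn(8, 4)
i, j = 2, 3  # arbitrary indices, chosen only for illustration

# One indexing operation: the element is looked up directly.
direct = X[i, j]

# Two indexing operations: first build a 1-D view of row i, then index it.
row = X[i]
chained = row[j]

print(direct == chained)          # same value either way
print(row.shape)                  # the intermediate row is a 1-D array
print(np.shares_memory(row, X))   # the row is a view onto X, not a copy
```

Both forms return the same value; the chained form just does an extra step to create the intermediate row.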

4 Answers

37

First, you should not be using NumPy arrays as lists of lists.

Second, let's forget about NumPy; your listcomp doesn't make any sense in the first place, even for lists of lists.

In the inner comprehension, for i in X is going to iterate over the rows in X. Those rows aren't numbers, they're lists (or, in NumPy, 1D arrays), so X[i,j] makes no sense whatsoever. You may have wanted i[j] instead.

In the outer comprehension, for j in X[i] has the same problem, but it has an even bigger problem: there is no i value. The loop over i lives in the inner comprehension, so i is not defined when the outer comprehension evaluates X[i].
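
A quick check of what iteration over a 2-D array actually yields may help here. This is only a sketch using the same 8x4 shape as the question; the variable names rows and raised are made up for illustration.

```python
import numpy as np

X = np.random.randn(8, 4)

# Iterating over a 2-D array yields its rows, each one a 1-D array.
rows = [row for row in X]
print(len(rows), rows[0].shape)

# Using a row (an array of floats) as an index is what triggers the
# IndexError shown in the question.
try:
    X[rows[0], 0]
    raised = False
except IndexError:
    raised = True
print(raised)
```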

If you're confused by a comprehension, write it out as an explicit for statement, as explained in the tutorial section on List Comprehensions:

tmp = []
for j in X[i]:
    tmp.append([X[i,j] for i in X])

… which expands to:

tmp = []
for j in X[i]:
    tmp2 = []
    for i in X:
        tmp2.append(X[i,j])
    tmp.append(tmp2)

… which should make it obvious what's wrong here.


I think what you wanted was:

[[cell for cell in row] for row in X]

Again, turn it back into explicit for statements:

tmp = []
for row in X:
    tmp2 = []
    for cell in row:
        tmp2.append(cell)
    tmp.append(tmp2)

That's obviously right.

Or, if you really want to use indexing (but you don't):

[[X[i][j] for j in range(len(X[i]))] for i in range(len(X))]

So, back to NumPy. In NumPy terms, that last version is:

[[X[i,j] for j in range(X.shape[1])] for i in range(X.shape[0])]

… and if you want to go in column-major order instead of row-major, you can (unlike with a list of lists):

[[X[i,j] for i in range(X.shape[0])] for j in range(X.shape[1])]

… but that will of course transpose the array, which isn't what you wanted to do.

The one thing you can't do is mix up column-major and row-major order in the same expression, because you end up with nonsense.


Of course the right way to make a copy of an array is to use the copy method:

X.copy()

Just as the right way to transpose an array is:

X.T
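
As a rough check of the claims above, the sketch below compares the row-major and column-major comprehensions against X.copy() and X.T. The variable names are made up for illustration.

```python
import numpy as np

X = np.random.randn(8, 4)

# Row-major comprehension: same values as X, but as a list of lists.
row_major = [[X[i, j] for j in range(X.shape[1])] for i in range(X.shape[0])]

# Column-major comprehension: this is the transpose.
col_major = [[X[i, j] for i in range(X.shape[0])] for j in range(X.shape[1])]

same_values = np.array_equal(np.array(row_major), X)
is_transpose = np.array_equal(np.array(col_major), X.T)

# copy() returns an independent array; .T returns a view on the same data.
Y = X.copy()
independent = not np.shares_memory(Y, X)
is_view = np.shares_memory(X.T, X)

print(type(row_major), same_values, is_transpose, independent, is_view)
```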

16

The easy way is to not do this. Use numpy's implicit vectorization instead. For example, if you have arrays A and B as follows:

A = numpy.array([[1, 3, 5],
                 [2, 4, 6],
                 [9, 8, 7]])
B = numpy.array([[5, 3, 5],
                 [3, 5, 3],
                 [5, 3, 5]])

then the following code using list comprehensions:

C = numpy.array([[A[i, j] * B[i, j] for j in range(A.shape[1])]
                 for i in range(A.shape[0])])

can be much more easily written as

C = A * B

It'll also run much faster. Generally, you will produce faster, clearer code if you don't use list comprehensions with numpy than if you do.

If you really want to use list comprehensions, standard Python list-comprehension-writing techniques apply. Iterate over the elements, not the indices:

C = numpy.array([[a*b for a, b in zip(a_row, b_row)]
                 for a_row, b_row in zip(A, B)])

Thus, your example code would become

numpy.array([[elem for elem in x_row] for x_row in X])
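
To confirm that the comprehension and the vectorized product agree, here is a small self-contained sketch using the A and B from this answer. The names C_loop and C_vec are made up for the comparison.

```python
import numpy as np

A = np.array([[1, 3, 5],
              [2, 4, 6],
              [9, 8, 7]])
B = np.array([[5, 3, 5],
              [3, 5, 3],
              [5, 3, 5]])

# Element-by-element product with nested comprehensions.
C_loop = np.array([[a * b for a, b in zip(a_row, b_row)]
                   for a_row, b_row in zip(A, B)])

# The same product using NumPy's elementwise multiplication.
C_vec = A * B

print(np.array_equal(C_loop, C_vec))
print(C_vec)
```

How much faster the vectorized form is depends on the array size and the machine, so it is worth timing on your own data (for example with timeit).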

3 Comments

Note: for efficiency, if you will not use the list any longer, you might want to use np.asarray() instead of np.array(). It's the same function with copy=False.
@RicardoCruz: It's going to copy either way. asarray can only avoid a copy if the input is already an array.
@user237112, Oh, thanks for pointing that out. I didn't know that.
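
The point about asarray can be seen with a short sketch (variable names are hypothetical): building an array from a nested list always creates new storage, while asarray on an existing array hands back the same object.

```python
import numpy as np

nested = [[1, 2], [3, 4]]

# From a list, both calls have to build a new array.
from_list = np.asarray(nested)

X = np.array(nested)

# asarray returns the input itself when it is already a suitable ndarray...
same_object = np.asarray(X) is X

# ...whereas array() copies by default.
copied = np.array(X) is not X

print(same_object, copied)
```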
5

Another option (though not necessarily performant) is to rethink your problem as a map instead of a comprehension and write a ufunc:

http://docs.scipy.org/doc/numpy/reference/ufuncs.html

You can call functional-lite routines like:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_over_axes.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html

Etc.
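
As a minimal sketch of the map-style approach, numpy.vectorize can wrap a plain scalar function so that it applies elementwise. The function clipped_square below is a hypothetical example, not something from the question, and numpy.vectorize is essentially a convenience loop, so it should not be expected to match the speed of a true array expression.

```python
import numpy as np

def clipped_square(x):
    # Hypothetical scalar function: square positive values, zero out the rest.
    return x * x if x > 0 else 0.0

X = np.random.randn(8, 4)

# Wrap the scalar function so it maps over every element of the array.
vsquare = np.vectorize(clipped_square, otypes=[float])
Y = vsquare(X)

# The same result written as a plain array expression, for comparison.
expected = np.where(X > 0, X * X, 0.0)

print(Y.shape, np.allclose(Y, expected))
```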


2

Do you mean the following?

>>> [[X[i,j] for j in range(X.shape[1])] for i in range(X.shape[0])]
[[0.62757350000000001, -0.64486080999999995, -0.18372566000000001, 0.78470704000000002],
 [1.78209799, -1.3364484599999999, -1.3851422200000001, -0.49668994],
 [-0.84148266000000005, 0.18864597999999999, -1.1135151299999999, -0.40225053999999999],
 [0.93852824999999995, 0.24652238000000001, 1.1481637499999999, -0.70346624999999996],
 [0.83842508000000004, 1.0058697599999999, -0.91267403000000002, 0.97991269000000003],
 [-1.4265273000000001, -0.73465904999999998, 0.66842849999999998, -0.21551155],
 [-1.1115614599999999, -1.0035033200000001, -0.11558254, -0.4339924],
 [1.8771354, -1.0189299199999999, -0.84754008000000003, -0.35387946999999997]]

Using numpy.ndarray.copy:

>>> X.copy()
array([[ 0.6275735 , -0.64486081, -0.18372566,  0.78470704],
       [ 1.78209799, -1.33644846, -1.38514222, -0.49668994],
       [-0.84148266,  0.18864598, -1.11351513, -0.40225054],
       [ 0.93852825,  0.24652238,  1.14816375, -0.70346625],
       [ 0.83842508,  1.00586976, -0.91267403,  0.97991269],
       [-1.4265273 , -0.73465905,  0.6684285 , -0.21551155],
       [-1.11156146, -1.00350332, -0.11558254, -0.4339924 ],
       [ 1.8771354 , -1.01892992, -0.84754008, -0.35387947]])

1 Comment

He says he's trying to produce the exact same shape he started with, so it's not even transpose, it's just copy.
