NumPy List Comprehension Syntax

Question

I'd like to be able to use list comprehension syntax to work with NumPy arrays easily.

For instance, I would like something like the below obviously wrong code to just reproduce the same array.

>>> X = np.random.randn(8,4)
>>> [[X[i,j] for i in X] for j in X[i]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: arrays used as indices must be of integer (or boolean) type

What is the easy way to do this, to avoid using range(len(X))?

Don't do this! It defeats the entire purpose of using NumPy.
– user2357112, Commented Jan 26, 2014 at 5:25
And (even if you fix it), this won't reproduce the array; it will produce a list of lists instead.
– abarnert, Commented Jan 26, 2014 at 5:26
X[i,j] is not syntactic sugar. For X[i,j] the value is retrieved directly. X[i][j] gets the i'th row, and then the j'th element of that row, so it takes more time.
– M4rtini, Commented Jan 26, 2014 at 11:31
Are you trying to do something with the entries, such as square them? The question might make more sense in that case.
– Walter Nissen, Commented Apr 27, 2017 at 20:32

Foreever · Accepted Answer · 2019-04-21 15:03:13Z

First, you should not be using NumPy arrays as lists of lists.

Second, let's forget about NumPy; your listcomp doesn't make any sense in the first place, even for lists of lists.

In the inner comprehension, for i in X iterates over the rows in X. Those rows aren't numbers, they're lists (or, in NumPy, 1D arrays), so X[i,j] makes no sense whatsoever. You may have wanted i[j] instead.

In the outer comprehension, for j in X[i] has the same problem, but it has an even bigger one: there is no i value yet. The comprehension that loops over each i sits inside this one, so i is not defined at the point where X[i] is evaluated.
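
A quick check makes the first point concrete. This is only a sketch: the seed and variable names are illustrative, not taken from the question. It shows that iterating over a 2D array yields rows rather than indices, and that using such a row as an index raises the error from the question.

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen only for reproducibility
X = rng.standard_normal((8, 4))

# Iterating over a 2D array yields its rows, each a 1D float array.
rows = [row for row in X]
first_row_shape = rows[0].shape

# Using a float row as an index is what triggers the IndexError.
try:
    X[rows[0], 0]
    raised = False
except IndexError:
    raised = True
```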

If you're confused by a comprehension, write it out as an explicit for statement, as explained in the tutorial section on List Comprehensions:

tmp = []
for j in X[i]:
    tmp.append([X[i,j] for i in X])

… which expands to:

tmp = []
for j in X[i]:
    tmp2 = []
    for i in X:
        tmp2.append(X[i,j])
    tmp.append(tmp2)

… which should make it obvious what's wrong here.

I think what you wanted was:

[[cell for cell in row] for row in X]

Again, turn it back into explicit for statements:

tmp = []
for row in X:
    tmp2 = []
    for cell in row:
        tmp2.append(cell)
    tmp.append(tmp2)

That's obviously right.
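
To confirm that the element-wise comprehension keeps the data intact, here is a minimal sketch. The small array is an illustrative stand-in for X, not the one from the question.

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(3, 4)  # illustrative stand-in for X

nested = [[cell for cell in row] for row in X]

# The result is a list of lists, not an array ...
is_list = isinstance(nested, list)
# ... but it holds the same values in the same row-major layout.
same_values = np.array_equal(np.array(nested), X)
```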

Or, if you really want to use indexing (but you don't):

[[X[i][j] for j in range(len(X[i]))] for i in range(len(X))]

So, back to NumPy. In NumPy terms, that last version is:

[[X[i,j] for j in range(X.shape[1])] for i in range(X.shape[0])]

… and if you want to go in column-major order instead of row-major, you can (unlike with a list of lists):

[[X[i,j] for i in range(X.shape[0])] for j in range(X.shape[1])]

… but that will of course transpose the array, which isn't what you wanted to do.

The one thing you can't do is mix up column-major and row-major order in the same expression, because you end up with nonsense.
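
The difference between the two index orders can be checked directly. The sketch below uses an illustrative 3x4 array and compares each comprehension with X and X.T.

```python
import numpy as np

X = np.arange(12).reshape(3, 4)  # illustrative 3x4 array

row_major = [[X[i, j] for j in range(X.shape[1])] for i in range(X.shape[0])]
col_major = [[X[i, j] for i in range(X.shape[0])] for j in range(X.shape[1])]

# Row-major order reproduces X; column-major order yields its transpose.
matches_X = np.array_equal(np.array(row_major), X)
matches_XT = np.array_equal(np.array(col_major), X.T)
```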

Of course the right way to make a copy of an array is to use the copy method:

X.copy()

Just as the right way to transpose an array is:

X.T
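
One practical difference between the two is worth knowing: copy returns independent data, while T returns a view onto the same memory. A small sketch with an illustrative array:

```python
import numpy as np

X = np.arange(6).reshape(2, 3)

Y = X.copy()   # independent data
Z = X.T        # view onto the same data, shape (3, 2)

Y[0, 0] = 100  # does not affect X
Z[0, 0] = -1   # does affect X, because Z shares X's memory

shares_with_copy = np.shares_memory(X, Y)
shares_with_view = np.shares_memory(X, Z)
```

If an independent transposed array is needed, X.T.copy() combines the two.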

user2357112 · Accepted Answer · 2014-01-26 05:32:01Z

16

The easy way is to not do this. Use numpy's implicit vectorization instead. For example, if you have arrays A and B as follows:

import numpy

A = numpy.array([[1, 3, 5],
                 [2, 4, 6],
                 [9, 8, 7]])
B = numpy.array([[5, 3, 5],
                 [3, 5, 3],
                 [5, 3, 5]])

then the following code using list comprehensions:

C = numpy.array([[A[i, j] * B[i, j] for j in range(A.shape[1])]
                 for i in range(A.shape[0])])

can be much more easily written as

C = A * B

It'll also run much faster. Generally, you will produce faster, clearer code if you don't use list comprehensions with numpy than if you do.
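
A rough way to see the speed difference for yourself is sketched below. The array size, seed, and function names are arbitrary choices for illustration, and the measured times depend on the machine, so no particular numbers are claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200))
B = rng.standard_normal((200, 200))

def with_comprehension():
    # Element-by-element multiplication driven from Python.
    return np.array([[A[i, j] * B[i, j] for j in range(A.shape[1])]
                     for i in range(A.shape[0])])

def vectorized():
    # The same multiplication done inside NumPy.
    return A * B

# Both approaches produce identical results.
same = np.array_equal(with_comprehension(), vectorized())

t_loop = timeit.timeit(with_comprehension, number=5)
t_vec = timeit.timeit(vectorized, number=5)
print(f"comprehension: {t_loop:.4f}s, vectorized: {t_vec:.4f}s")
```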

If you really want to use list comprehensions, standard Python list-comprehension-writing techniques apply. Iterate over the elements, not the indices:

C = numpy.array([[a*b for a, b in zip(a_row, b_row)]
                 for a_row, b_row in zip(A, B)])

Thus, your example code would become

numpy.array([[elem for elem in x_row] for x_row in X])
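
Going back to the A and B arrays from earlier in this answer, the zip-based comprehension can be checked against A * B. This is a self-contained sketch of that comparison:

```python
import numpy

A = numpy.array([[1, 3, 5], [2, 4, 6], [9, 8, 7]])
B = numpy.array([[5, 3, 5], [3, 5, 3], [5, 3, 5]])

# Pair up rows, then pair up elements within each row.
C = numpy.array([[a * b for a, b in zip(a_row, b_row)]
                 for a_row, b_row in zip(A, B)])

print(C.tolist())
# [[5, 9, 25], [6, 20, 18], [45, 24, 35]]

matches_vectorized = numpy.array_equal(C, A * B)
```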


3 Comments

Ricardo Cruz Over a year ago

Note: for efficiency, if you will not use the list any longer, you might want to use np.asarray() instead of np.array(). It's the same function with copy=False.

user2357112 Over a year ago

@RicardoCruz: It's going to copy either way. asarray can only avoid a copy if the input is already an array.

Ricardo Cruz Over a year ago

@user237112, Oh, thanks for pointing that out. I didn't know that.
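
The point made in this comment thread can be checked with a short sketch (the variable names are illustrative): asarray returns the same object when given an existing array, but it still has to build a new array when given a list.

```python
import numpy as np

nested = [[1, 2], [3, 4]]
from_list = np.asarray(nested)          # a new array is built from the list

arr = np.array(nested)
same_object = np.asarray(arr) is arr    # no copy for an existing array
new_object = np.array(arr) is arr       # np.array copies by default
```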

exogeographer · Accepted Answer · 2014-01-26 15:20:21Z

5

Another option (though not necessarily performant) is to rethink your problem as a map instead of a comprehension and write a ufunc:

http://docs.scipy.org/doc/numpy/reference/ufuncs.html

You can call functional-lite routines like:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_over_axes.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html

Etc.
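
As a rough illustration of the map-style approach, the sketch below uses the routines linked above. The scalar function clip_square is hypothetical and exists only for this example. As far as I know, numpy.vectorize is a convenience wrapper around a Python-level loop, so it is not expected to be fast.

```python
import numpy as np

X = np.arange(6, dtype=float).reshape(2, 3)

def clip_square(x):
    # Hypothetical scalar function, used only for illustration.
    return x * x if x > 2 else 0.0

# np.vectorize wraps a scalar function so it maps over every element.
mapped = np.vectorize(clip_square)(X)

# Built-in ufuncs such as np.square are the compiled counterpart.
squared = np.square(X)

# np.apply_over_axes applies a reducing function along the listed axes,
# keeping each reduced axis with length 1.
col_sums = np.apply_over_axes(np.sum, X, [0])
```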


falsetru · Accepted Answer · 2014-01-26 05:35:50Z

2

Do you mean the following?

>>> [[X[i,j] for j in range(X.shape[1])] for i in range(X.shape[0])]
[[0.62757350000000001, -0.64486080999999995, -0.18372566000000001, 0.78470704000000002],
 [1.78209799, -1.3364484599999999, -1.3851422200000001, -0.49668994],
 [-0.84148266000000005, 0.18864597999999999, -1.1135151299999999, -0.40225053999999999],
 [0.93852824999999995, 0.24652238000000001, 1.1481637499999999, -0.70346624999999996],
 [0.83842508000000004, 1.0058697599999999, -0.91267403000000002, 0.97991269000000003],
 [-1.4265273000000001, -0.73465904999999998, 0.66842849999999998, -0.21551155],
 [-1.1115614599999999, -1.0035033200000001, -0.11558254, -0.4339924],
 [1.8771354, -1.0189299199999999, -0.84754008000000003, -0.35387946999999997]]

Using numpy.ndarray.copy:

>>> X.copy()
array([[ 0.6275735 , -0.64486081, -0.18372566,  0.78470704],
       [ 1.78209799, -1.33644846, -1.38514222, -0.49668994],
       [-0.84148266,  0.18864598, -1.11351513, -0.40225054],
       [ 0.93852825,  0.24652238,  1.14816375, -0.70346625],
       [ 0.83842508,  1.00586976, -0.91267403,  0.97991269],
       [-1.4265273 , -0.73465905,  0.6684285 , -0.21551155],
       [-1.11156146, -1.00350332, -0.11558254, -0.4339924 ],
       [ 1.8771354 , -1.01892992, -0.84754008, -0.35387947]])

edited Jan 26, 2014 at 5:35

answered Jan 26, 2014 at 5:24


1 Comment

abarnert Over a year ago

He says he's trying to produce the exact same shape he started with, so it's not even transpose, it's just copy.
