I need a way to iterate over each element in a numpy array of any shape, and store its index in a list.

This code produces arrays of general shapes:

import numpy as np

# Generate random shape for the array.
sha = np.random.randint(1, 5, size=(np.random.randint(1, 10)))
# Random array.
a = np.random.rand(*sha)

I need to iterate over each element in a and store its index in a list.

The closest I've gotten is by flattening the array:

for i, elem in enumerate(a.flatten()):
    print(i, elem)

which allows me to iterate over every element in a no matter its shape, but I lose the index at which each element is stored in the array.

3 Answers

You can use itertools to create the Cartesian product of the index ranges of all dimensions:

from itertools import product
for i in product(*[range(dim) for dim in a.shape]):
    print(i, a[i])
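
Since the question asks for the indices to be stored in a list, here is a minimal sketch of the same idea that collects them instead of printing. The small array a and the names indices and values are only illustrative assumptions, not part of the answer above.

```python
from itertools import product

import numpy as np

# Hypothetical small example array; any shape works the same way.
a = np.arange(6).reshape(2, 3)

# Build every index tuple as the Cartesian product of the per-axis ranges.
indices = list(product(*(range(dim) for dim in a.shape)))

# Each tuple indexes exactly one element of the array.
values = [a[idx] for idx in indices]

print(indices)
print(values)
```

The tuples come out in row-major order, so the list lines up with a.flatten().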

1 Comment

All great answers but I'm selecting this one since it does precisely what I asked for. Thank you all!

You're looking for numpy.unravel_index:

>>> np.dstack(np.unravel_index(np.arange(a.size), a.shape))
array([[[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 1],
        [0, 0, 0, ..., 0, 0, 2],
        ...,
        [2, 2, 0, ..., 1, 1, 1],
        [2, 2, 0, ..., 1, 1, 2],
        [2, 2, 0, ..., 1, 1, 3]]])
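
If a plain list of index tuples is wanted rather than a stacked array, the per-axis arrays returned by np.unravel_index can be zipped together. This is only a sketch; the example array and the name index_list are assumptions made for illustration.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)  # hypothetical example array

# unravel_index returns one array of coordinates per axis.
coords = np.unravel_index(np.arange(a.size), a.shape)

# Zip the per-axis arrays together to get one index tuple per element.
index_list = [tuple(int(c) for c in idx) for idx in zip(*coords)]

print(index_list[:3])
```

Note that np.dstack on these 1-d coordinate arrays produces an array of shape (1, a.size, a.ndim), which is why the output above has an extra leading dimension.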

np.ndindex generates the multidimensional index tuples for an array of any shape.

In fact with this we can generate the nd array of dtype object, and fill it with some function of these indices in one step (ok, 2 steps):

shape=(3,4,2)
A = np.empty(shape, dtype=object)
for i in np.ndindex(3,4,2):
    A[i]=i   # or use A[i]=list(i) 

producing A:

array([[[(0, 0, 0), (0, 0, 1)],
        [(0, 1, 0), (0, 1, 1)],
        [(0, 2, 0), (0, 2, 1)],
        [(0, 3, 0), (0, 3, 1)]],

       [[(1, 0, 0), (1, 0, 1)],
        [(1, 1, 0), (1, 1, 1)],
        [(1, 2, 0), (1, 2, 1)],
        [(1, 3, 0), (1, 3, 1)]],

       [[(2, 0, 0), (2, 0, 1)],
        [(2, 1, 0), (2, 1, 1)],
        [(2, 2, 0), (2, 2, 1)],
        [(2, 3, 0), (2, 3, 1)]]], dtype=object)

I filled this with tuples rather than lists because the display is clearer. In

array([[[[0, 0, 0], [0, 0, 1]],
        [[0, 1, 0], [0, 1, 1]],
        [[0, 2, 0], [0, 2, 1]],
        ....

it is hard to distinguish between an array dimension and a list.

With dtype object, the individual elements could be anything. np.empty fills it with None.
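
A quick way to check both statements is to look at an element before and after filling. This is a small sketch; the name before is just an illustrative variable.

```python
import numpy as np

shape = (3, 4, 2)
A = np.empty(shape, dtype=object)

# A fresh object array holds None in every slot.
before = A[0, 0, 0]

# Store each element's own index tuple at that position.
for i in np.ndindex(*shape):
    A[i] = i

print(before, A[1, 2, 1])
```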


A related data structure would be a flat list of these tuples. It could still be accessed as though it were nd by using np.ravel_multi_index.

L1 = [i for i in np.ndindex(shape)]

[(0, 0, 0),
 (0, 0, 1),
 (0, 1, 0),
 ...
 (2, 2, 1),
 (2, 3, 0),
 (2, 3, 1)]

Accessed as a simulated deeply nested list:

 L1[np.ravel_multi_index(i,shape)]
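
To make the mapping concrete, here is a short round-trip sketch: np.ravel_multi_index turns an index tuple into a flat position, and np.unravel_index turns it back. The names idx, flat and back are illustrative only.

```python
import numpy as np

shape = (3, 4, 2)
L1 = list(np.ndindex(*shape))

idx = (2, 1, 0)

# Row-major flat position: 2*(4*2) + 1*2 + 0 = 18.
flat = np.ravel_multi_index(idx, shape)

# The flat list holds the same tuple at that position.
item = L1[flat]

# unravel_index is the inverse mapping.
back = tuple(int(c) for c in np.unravel_index(flat, shape))

print(flat, item, back)
```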

A few access time comparisons:

In [137]: %%timeit
for i in np.ndindex(shape):
    x=A[i]
10000 loops, best of 3: 88.3 µs per loop

In [138]: %%timeit
for i in np.ndindex(shape):
    x=L1[np.ravel_multi_index(i,shape)]
1000 loops, best of 3: 227 µs per loop

So multidimensional indexing of the array is faster.

In [140]: %%timeit
   .....: for i in L1:
   .....:     x=i
1000000 loops, best of 3: 774 ns per loop

In [143]: %%timeit
   .....: for i in A.flat:
   .....:     x=i
1000000 loops, best of 3: 1.44 µs per loop

But direct iteration over the list is much faster.


For the itertools.product iterator:

In [163]: %%timeit
   .....: for i in itertools.product(*[range(x) for x in shape]):
   .....:     x=A[i]
   .....: 
100000 loops, best of 3: 12.1 µs per loop
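
The timings above come from an IPython session. To repeat the comparison in a plain script, something like the following sketch with the standard timeit module should work. The helper names via_ndindex, via_product and via_flat are made up for this example, and the numbers will differ by machine and NumPy version.

```python
import itertools
import timeit

import numpy as np

shape = (3, 4, 2)
A = np.empty(shape, dtype=object)
for i in np.ndindex(*shape):
    A[i] = i

def via_ndindex():
    # Multidimensional indexing driven by np.ndindex.
    return [A[i] for i in np.ndindex(*shape)]

def via_product():
    # Same indexing, driven by itertools.product.
    return [A[i] for i in itertools.product(*(range(n) for n in shape))]

def via_flat():
    # Direct iteration over the elements, no index tuples involved.
    return list(A.flat)

for fn in (via_ndindex, via_product, via_flat):
    seconds = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```

All three helpers visit the elements in the same row-major order, so only the cost of generating and applying the indices differs.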
