Build a NumPy d-dim array from an iterator of (d-1)-dim arrays

Question

I have a use case that I have simplified to the following:

import numpy as np

def get_matrix(i): # get a matrix N * M
    return (
        (i, i + 1, i + 1.2),
        (i + 1, i / 2, i * 3.2),
        (i / 3, i * 2, i / 4),
        (i / 5, i * 2.1, i + 2.2),
    )

K = 10000
# build an n-d array of shape K * N * M
arr = np.array(
    tuple(get_matrix(i) for i in range(K)),
    np.float32,
)

However, to get the K*N*M numpy array this way, I first have to create a temporary tuple of shape K*N*M. The tuple can only be garbage collected after the numpy array has been built, so this construction needs O(K*N*M) extra space.

If I could create the numpy array directly from the iterator (get_matrix(i) for i in range(K)), each N*M matrix could be garbage collected as soon as it has been consumed, so the extra space would be only O(N*M).

I found the method numpy.fromiter(), but I don't know how to write the dtype, even though the last example in its documentation is similar.

import numpy as np

K = 10000
# build an n-d array of shape K * N * M
arr = np.fromiter(
    (get_matrix(i) for i in range(K)),
    dtype=np.float32,  # this raises an error
)

juanpa.arrivillaga · Accepted Answer · 2022-11-07 18:59:56Z

Ah, so this is a new feature for np.fromiter (subarray dtypes are supported as of NumPy 1.23). Going by the example in the docs, the following works:

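import numpy as np

# get_matrix is the function from the question; it returns an N x M nested tuple
K = 10000
N = 4
M = 3

# build an n-d array of shape K * N * M
arr = np.fromiter(
    (get_matrix(i) for i in range(K)),
    # subarray dtype: each item from the iterator is one N x M float32 block
    dtype=np.dtype((np.float32, (N, M))),
    count=K,
)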

Note: I passed the count argument for good measure, but it works without it. Supplying count lets fromiter preallocate the output array instead of growing it on demand.
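For anyone who wants to check the result, here is a minimal self-contained sketch of the approach above. It assumes NumPy 1.23 or later for the subarray dtype. The helper names build_with_fromiter and build_preallocated are made up for this illustration, and the preallocate-and-fill version is not part of the answer itself; it is a common fallback that should also work on older NumPy, at the cost of an explicit Python loop.

```python
import numpy as np


def get_matrix(i):
    # Same 4 x 3 matrix as in the question.
    return (
        (i, i + 1, i + 1.2),
        (i + 1, i / 2, i * 3.2),
        (i / 3, i * 2, i / 4),
        (i / 5, i * 2.1, i + 2.2),
    )


K, N, M = 1000, 4, 3


def build_with_fromiter(k):
    # Hypothetical helper: consume the generator one N x M item at a time.
    return np.fromiter(
        (get_matrix(i) for i in range(k)),
        dtype=np.dtype((np.float32, (N, M))),  # subarray dtype
        count=k,
    )


def build_preallocated(k):
    # Hypothetical fallback: allocate the result once, then copy each
    # matrix into its slot so only one matrix is alive at a time.
    out = np.empty((k, N, M), dtype=np.float32)
    for i in range(k):
        out[i] = get_matrix(i)
    return out


arr_iter = build_with_fromiter(K)
arr_fill = build_preallocated(K)

print(arr_iter.shape, arr_iter.dtype)
print(np.allclose(arr_iter, arr_fill))
```

Both versions avoid materializing the full K*N*M tuple, so the temporary Python objects stay at roughly one N*M matrix at a time.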
