Create a NumPy array in a for loop without using concatenation

Question

I am creating a simulation with multiple for loops. My goal is to create a NumPy array with all the values. I first used numpy.concatenate, since it does the job. I have read, though, that np.concatenate is very slow, so I am looking for a faster way to create the array with the values. My code:

import numpy as np
values = np.array([])
for n in [100,1000]:
    for m in [2,10,100]:
        for roh in [0.0,0.5,0.9]:
            values = np.concatenate((values,[n,m,roh,1,2]))

The values 1 and 2 are just sample values and are not important for this question. Is there a faster or smarter way to create a NumPy array with all the combinations (the Cartesian product) produced by the triple for loop?

Peter Meisrimel · Accepted Answer · 2020-07-22 15:17:35Z

2

(This answer was modified to also address the follow-up questions.)

itertools would be very convenient for this, for example:

import numpy as np
import itertools
n_list, m_list, rho_list = [100,1000], [2,10,100], [0.0,0.5,0.9]

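# Placeholder computations on the distance matrix; change accordingly.
f1, f2 = lambda mat: 1, lambda mat: 2

def f(n, m, rho, f1, f2):
    # Assemble the distance matrix for this (n, m, rho); replace accordingly.
    distance_matrix = np.zeros((2, 2))
    return f1(distance_matrix), f2(distance_matrix)

# Bind f1 and f2 so that ff only takes the loop parameters.
ff = lambda n, m, rho: f(n, m, rho, f1, f2)

# One row per (n, m, rho) combination; *ff(...) appends the two computed values.
values = np.array([
    [n, m, rho, *ff(n, m, rho)]
    for n, m, rho in itertools.product(n_list, m_list, rho_list)
])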

The *ff(...) is to unpack the tuple into two separate values.
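As a small illustration of that unpacking, here is a minimal sketch. The helper `pair` and its formulas are hypothetical stand-ins for the two computed values, and the parameter lists are shortened for brevity.

```python
import itertools
import numpy as np

def pair(n, m, rho):
    # Hypothetical stand-in for the two computed values (not from the answer).
    return n * m, rho + 1.0

# *pair(...) splices the two returned values into the row as separate entries.
rows = [[n, m, rho, *pair(n, m, rho)]
        for n, m, rho in itertools.product([100, 1000], [2, 10], [0.0, 0.5])]
values = np.array(rows)
print(values.shape)  # (8, 5): 2 * 2 * 2 combinations, 3 parameters + 2 values
```

Without the star, each row would hold a nested tuple in its fourth position instead of five scalar entries.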

edited Jul 22, 2020 at 15:17

answered Jul 22, 2020 at 13:23

Peter Meisrimel


6 Comments

James Over a year ago

The values 1 and 2 in my question will be calculated from n, m, roh. Let's say these values are stored in the variables "value1" and "value2". Will I be able to use these values in your answer as well?

Peter Meisrimel Over a year ago

Sure, given your two functions f1 and f2 for computing these values, you do

values = np.array([[n, m, rho, f1(n, m, rho), f2(n, m, rho)] for n, m, rho in itertools.product(n_list, m_list, rho_list)])

James Over a year ago

I tested your answer and Michael Sidorov's and posted my test results as an answer. I still upvoted your answer as the one solving my problem.

James Over a year ago

Let's say that for each n, m, roh iteration I want to create multiple values arrays like the one you showed, but the functions f1() and f2() in those arrays should depend on a distance_matrix (created in each n, m, roh iteration). Is this possible? To clarify: I have 18 n, m, roh permutations, meaning 18 different distance matrices, and for each of them there will be multiple values arrays using f1(distance_matrix) and f2(distance_matrix). I hope my question is clear.

Peter Meisrimel Over a year ago

You could merge f1 and f2 into a single function f, in which you assemble the distance matrix and then return the tuple f1(distance_matrix), f2(distance_matrix).
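A minimal sketch of that suggestion could look as follows. How the distance matrix is actually built is not stated in this thread, so the random points, the seed, the small parameter lists, and the choice of mean and maximum for f1 and f2 are all assumptions made only for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable

def f1(mat):
    return mat.mean()  # hypothetical first value

def f2(mat):
    return mat.max()   # hypothetical second value

def f(n, m, rho):
    # Hypothetical distance matrix: pairwise Euclidean distances between
    # n random points in m dimensions. rho is not used in this placeholder.
    pts = rng.normal(size=(n, m))
    diff = pts[:, None, :] - pts[None, :, :]
    distance_matrix = np.sqrt((diff ** 2).sum(axis=-1))
    return f1(distance_matrix), f2(distance_matrix)

n_list, m_list, rho_list = [5, 10], [2, 3], [0.0, 0.5]  # kept small on purpose
values = np.array([[n, m, rho, *f(n, m, rho)]
                   for n, m, rho in itertools.product(n_list, m_list, rho_list)])
print(values.shape)  # (8, 5)
```

The matrix is built once per (n, m, rho) combination and shared by both computations, which is the point of merging them into one function.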


Michael · Accepted Answer · 2020-07-22 13:45:30Z

1

You can do it the following way:

np.array([[n,m,roh,1,2] for n in [100,1000] for m in [2,10,100] for roh in [0.0,0.5,0.9]])

Cheers.
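For constant extra columns like the 1 and 2 here, the Python-level loop can also be avoided entirely. The following is a sketch of a different technique than the list comprehension above, using np.meshgrid; it calls np.hstack once at the end rather than concatenating on every iteration.

```python
import numpy as np

n_list, m_list, rho_list = [100, 1000], [2, 10, 100], [0.0, 0.5, 0.9]

# indexing="ij" keeps the first list as the slowest-varying axis,
# matching the nesting order of the original for loops.
grids = np.meshgrid(n_list, m_list, rho_list, indexing="ij")
params = np.stack([g.ravel() for g in grids], axis=1)  # shape (18, 3)

# Constant placeholder columns 1 and 2, as in the question.
extra = np.tile([1.0, 2.0], (params.shape[0], 1))      # shape (18, 2)
values = np.hstack([params, extra])                    # shape (18, 5)
print(values.shape)
```

This only helps when the extra columns can be computed for all rows at once; values that need per-row Python logic still call for one of the loop-based forms.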

answered Jul 22, 2020 at 13:45

Michael


James · Accepted Answer · 2020-07-22 14:15:35Z

Thanks to "Peter Meisrimel" and "Michael Sidorov" for their answers. I also tried creating an empty array first and then inserting the values into it. I implemented both answers as well and tested the performance of all approaches (including the concatenation). To see the performance differences better, I increased the size of my n, m, roh substantially. This is my code:

import numpy as np
import itertools
import time

rows = np.arange(40)
columns = np.arange(40)
rohs = np.arange(40)
num1 = np.array([1])
num2 = np.array([2])

start1 = time.time()
asd1 = np.empty((64000,5),dtype=np.float32)
iterator = 0
for a in rows:
    for s in columns:
        for d in rohs:
            asd1[iterator] = a,s,d,1,2
            iterator += 1
end1 = time.time()
print("time with empty array: "+str(end1-start1))

start2 = time.time()
asd2 = np.array([a for a in itertools.product(rows,columns,rohs,num1,num2)])
end2 = time.time()
print("time with itertools: "+str(end2-start2))

start3 = time.time()
asd3 = np.array([[n,m,roh,1,2] for n in rows for m in columns for roh in rohs])
end3 = time.time()
print("time with normal np array: "+str(end3-start3))

start4 = time.time()
asd4 = np.array([])
for i in rows:
    for j in columns:
        for k in rohs:
            asd4 = np.concatenate((asd4,[i,j,k,1,2]))
end4 = time.time()
print("time with concatenation: "+str(end4-start4))

When I ran this code, I got the following output:

time with empty array: 0.203110933303833
time with itertools: 0.14061522483825684
time with normal np array: 0.15624260902404785
time with concatenation: 49.35244131088257

In conclusion, both given answers were substantially faster than the concatenation method (about 0.14 to 0.16 s versus about 49 s). Preallocating an empty array was also fast (about 0.20 s), and itertools was the fastest in this run, with the difference increasing for larger numbers of permutations.
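Single time.time() measurements can vary between runs, so a comparison like this may be easier to repeat with the standard library's timeit module. The sketch below is not the code that produced the numbers above; it assumes a smaller grid (10 values per axis instead of 40) to keep the run short, and its function names are made up for illustration. It also reshapes the concatenation result, which is otherwise a flat 1-D array rather than rows of five values.

```python
import itertools
import timeit
import numpy as np

axis = np.arange(10)  # smaller than the 40 used above, to keep the run short

def with_prealloc():
    out = np.empty((axis.size ** 3, 5))
    for i, (a, s, d) in enumerate(itertools.product(axis, axis, axis)):
        out[i] = a, s, d, 1, 2
    return out

def with_comprehension():
    return np.array([[a, s, d, 1, 2] for a in axis for s in axis for d in axis])

def with_concatenate():
    out = np.array([])
    for a, s, d in itertools.product(axis, axis, axis):
        # np.concatenate allocates a new array and copies `out` on every call.
        out = np.concatenate((out, [a, s, d, 1, 2]))
    return out.reshape(-1, 5)  # flat result reshaped to match the others

for fn in (with_prealloc, with_comprehension, with_concatenate):
    best = min(timeit.repeat(fn, number=1, repeat=3))
    print(f"{fn.__name__}: {best:.4f} s")
```

Because every np.concatenate call copies everything accumulated so far, the total work grows roughly quadratically with the number of rows, which is consistent with the large gap reported above.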
