I have a (huge) 2D array. For example:
a=[[1,2],[2,3],[4,5]]
I need to extract from it the rows that satisfy certain conditions, for example:
a[:,0]>1 and a[:,1]>2
so that I get back an array containing only the rows that satisfy both conditions:
[[2,3],[4,5]]
(I need to further use that in a loop, which might or might not be relevant to the question)
I have tried the following:
np.transpose([np.extract(a[:,0]>1,a[:,0]),np.extract(a[:,1]>2,a[:,1])])
The above works only when both extracted arrays have the same length. Even when it works, it sometimes returns pairs that weren't paired together to begin with (I understand why).
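To make the mispairing concrete, here is a small sketch. It uses a hypothetical input (not the array from the example above) in which the two column-wise extractions happen to have the same length but come from different rows:

```python
import numpy as np

# Hypothetical input, chosen so that each condition holds on a different set of rows.
a = np.array([[2, 1], [1, 3], [4, 5]])

col0 = np.extract(a[:, 0] > 1, a[:, 0])  # first-column values from rows 0 and 2
col1 = np.extract(a[:, 1] > 2, a[:, 1])  # second-column values from rows 1 and 2
mispaired = np.transpose([col0, col1])

print(mispaired)

# The first pair combines a value from row 0 with a value from row 1,
# so it does not correspond to any row of the original array.
rows = a.tolist()
print([pair in rows for pair in mispaired.tolist()])
```

Each column is filtered independently here, so the information about which row a value came from is lost before the columns are stitched back together.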
I know how to do it with lists:
list(filter(lambda b: b[0]>1 and b[1]>2,a))
However, I want to improve efficiency, so I am moving to numpy (I've read that it is generally more efficient). Is there a way to do the above in numpy that is significantly faster than with lists? (I would be executing this piece of code thousands of times on arrays with hundreds of elements.)
Update: Following Maarten_vd_Sande's answer:
The following code was used to check the time taken:
import numpy as np
import time
b=np.random.rand(10000000,2)
a=b.tolist()
strt=time.time()
c=b[np.logical_and(b[:,0]>0.5,b[:,1]>0.5)]
for (i,j) in c:
    continue
print("Numpy= ",time.time()-strt)
strt=time.time()
for (i,j) in list(filter(lambda m: m[0]>0.5 and m[1]>0.5,a)):
    continue
print("List= ",time.time()-strt)
Output:
Numpy= 2.973170042037964
List= 1.91910982131958
>2 which results in an empty list. Change it to >0.5 and the numpy approach is twice as fast (and more than 10x as fast if you remove the empty loop).
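For reference, here is a minimal sketch of the boolean-mask filtering applied to the small example from the top of the question, followed by a timing of the filtering step alone using `timeit`. The array size (500 rows) and the repetition count are assumptions picked to resemble the workload described above, and the helper names `numpy_filter` and `list_filter` are hypothetical. Actual timings will depend on the machine.

```python
import timeit

import numpy as np

# Boolean mask on the small example: keep rows where both conditions hold.
a = np.array([[1, 2], [2, 3], [4, 5]])
mask = (a[:, 0] > 1) & (a[:, 1] > 2)
filtered = a[mask]
print(filtered)

# Assumed workload: a few hundred rows, filtered many times.
rng = np.random.default_rng(0)
b = rng.random((500, 2))
b_list = b.tolist()


def numpy_filter():
    # Row-wise filter with a combined boolean mask.
    return b[(b[:, 0] > 0.5) & (b[:, 1] > 0.5)]


def list_filter():
    # Equivalent filter on a plain list of [x, y] pairs.
    return list(filter(lambda m: m[0] > 0.5 and m[1] > 0.5, b_list))


# Time only the filtering, without iterating over the result in Python.
print("Numpy =", timeit.timeit(numpy_filter, number=1000))
print("List  =", timeit.timeit(list_filter, number=1000))
```

Timing only the filter keeps the cost of looping over the result in Python out of the comparison, which is the loop the note above suggests removing.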