I have a 2-D NumPy array with 100,000+ rows. I need to return a subset of those rows, and I need to perform that operation many thousands of times, so efficiency is important.
A mock-up example is like this:
import numpy as np
a = np.array([[1,5.5],
[2,4.5],
[3,9.0],
[4,8.01]])
b = np.array([2,4])
I want to return the rows of a whose first-column value appears in b:
c=[[2,4.5],
[4,8.01]]
The difference, of course, is that the real a and b both have many more rows, so I'd like to avoid looping. I also tried building a dictionary and using np.nonzero, but I'm still a bit stumped.
Thanks in advance for any ideas!
EDIT: Note that, in this case, the values in b are identifiers rather than row indices. Here's a revised example:
import numpy as np
a = np.array([[102,5.5],
[204,4.5],
[343,9.0],
[40,8.01]])
b = np.array([102,343])
And I want to return:
c = [[102,5.5],
[343,9.0]]
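For reference, here is a minimal sketch of two loop-free ways this kind of selection might be done on the revised example. It assumes the identifiers in the first column of a are unique and compare exactly with the values in b. The helper name lookup_rows is hypothetical, used only for illustration. The first approach uses np.isin to build a boolean mask and returns rows in the order they appear in a. The second sorts the identifiers once and then uses np.searchsorted for each query, which may suit the case where the same a is queried many times; it returns rows in the order of b and drops identifiers not present in a.

```python
import numpy as np

a = np.array([[102, 5.5],
              [204, 4.5],
              [343, 9.0],
              [40, 8.01]])
b = np.array([102, 343])

# Approach 1: boolean mask over the first column.
# Rows come back in the order they appear in a, not in b.
mask = np.isin(a[:, 0], b)
c_mask = a[mask]

# Approach 2: sort the identifiers once, then binary-search per query.
# Assumes the identifiers in a[:, 0] are unique.
order = np.argsort(a[:, 0])      # row positions, sorted by identifier
sorted_ids = a[order, 0]         # identifiers in ascending order

def lookup_rows(ids):
    """Return the rows of a whose identifier is in ids, in the order of ids."""
    ids = np.asarray(ids)
    pos = np.searchsorted(sorted_ids, ids)
    # Clip so identifiers larger than every id in a do not index out of bounds.
    pos = np.clip(pos, 0, len(sorted_ids) - 1)
    found = sorted_ids[pos] == ids   # keep only exact matches
    return a[order[pos[found]]]

c_sorted = lookup_rows(b)
```

Both c_mask and c_sorted hold the rows for identifiers 102 and 343 here. The sort in the second approach is done once up front, so each later call only pays for the binary search over b.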