Your input data is a plain Python list of NumPy arrays of unequal length, so it can't simply be converted to a 2D NumPy array and processed directly by NumPy. But it can be processed using the usual Python list processing tools.
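To make the point about unequal lengths concrete, here is a small sketch using the same six rows as the example further down. The names rows, lengths and ragged are only for illustration, and dtype=object is passed explicitly because recent NumPy versions will not build a ragged array without it.

```python
import numpy as np

rows = [
    np.array([10, 1, 7, 3]), np.array([0, 14, 12, 13]),
    np.array([3, 10, 7, 8]), np.array([7, 5]),
    np.array([5, 12, 3]), np.array([14, 8, 10]),
]

# The rows come in three different lengths, so they cannot form a rectangular 2D array.
lengths = sorted({len(row) for row in rows})
print(lengths)

# Forcing the conversion gives a 1D object array whose items are the original arrays.
ragged = np.array(rows, dtype=object)
print(ragged.shape, ragged.dtype)
```

An object array like this gains nothing over the plain list for vectorised tests, which is why the rows are handled one at a time.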
Here's a list comprehension that uses numpy.isin to test if a row contains any of (3, 7, 8). We first use a simple == test to see if the row contains 10, and only call isin if it does; the Python and operator does not evaluate its second operand if the first operand is falsy.
We use np.any to see if any row item passes each test. np.any returns a Boolean value of False or True, but we can pass those values to int to convert them to 0 or 1.
import numpy as np
data = [
    np.array([10, 1, 7, 3]), np.array([0, 14, 12, 13]),
    np.array([3, 10, 7, 8]), np.array([7, 5]),
    np.array([5, 12, 3]), np.array([14, 8, 10]),
]
mask = np.array([3, 7, 8])
result = [int(np.any(row==10) and np.any(np.isin(row, mask)))
    for row in data]
print(result)
output
[1, 0, 1, 0, 0, 1]
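As a rough check of the short-circuiting described above, the sketch below wraps the isin test in a hypothetical helper, contains_any, that records each call. The helper and the isin_calls list are assumptions made for this illustration and are not part of the solution itself.

```python
import numpy as np

data = [
    np.array([10, 1, 7, 3]), np.array([0, 14, 12, 13]),
    np.array([3, 10, 7, 8]), np.array([7, 5]),
    np.array([5, 12, 3]), np.array([14, 8, 10]),
]
mask = np.array([3, 7, 8])
isin_calls = []

def contains_any(row, values):
    """Hypothetical wrapper around np.isin that records each call."""
    isin_calls.append(row)
    return np.any(np.isin(row, values))

# The right-hand operand of `and` only runs for rows that contain 10.
result = [int(np.any(row == 10) and contains_any(row, mask)) for row in data]
print(result)
print(len(isin_calls))
```

Only three of the six rows contain 10, so the wrapper should be called three times.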
I've just performed some timeit tests. Curiously, Reblochon Masque's code is faster on the data given in the question, presumably because of the short-circuiting behaviour of plain Python any, and, and or. Also, it appears that numpy.in1d is faster than numpy.isin, even though the docs recommend using the latter in new code.
Here's a new version that's about 10% slower than Reblochon's.
mask = np.array([3, 7, 8])
result = [int(any(row==10) and any(np.in1d(row, mask)))
    for row in data]
Of course, the true speed on large amounts of real data may vary from what my tests indicate. And time may not be an issue: even on my slow old 32-bit single-core 2GHz machine I can process the data in the question almost 3000 times in one second.
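The short-circuiting of the built-in any mentioned earlier can be observed with a small sketch. The generator tracked and the list examined are hypothetical names introduced only to record how many items any looks at; this shows the behaviour and says nothing about which approach is faster on a given input.

```python
import numpy as np

row = np.array([10, 1, 7, 3])
examined = []

def tracked(flags):
    """Hypothetical helper: yield each flag and record that it was looked at."""
    for flag in flags:
        examined.append(bool(flag))
        yield flag

# The built-in any stops consuming items as soon as it sees a true one.
found = any(tracked(row == 10))
print(found, examined)
```

Here 10 is the first item, so any should stop after examining a single element.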
hpaulj has suggested an even faster way. Here's some timeit test info, comparing the various versions. These tests were performed on my old machine, YMMV.
import numpy as np
from timeit import Timer
the_data = [
    np.array([10, 1, 7, 3]), np.array([0, 14, 12, 13]),
    np.array([3, 10, 7, 8]), np.array([7, 5]),
    np.array([5, 12, 3]), np.array([14, 8, 10]),
]
def rebloch0(data):
    result = []
    for output in data:
        result.append(1 if np.where((any(output == 10) and any(output == 7)) or
            (any(output == 10) and any(output == 3)) or
            (any(output == 10) and any(output == 8)), 1, 0) == True else 0)
    return result

def rebloch1(data):
    result = []
    for output in data:
        result.append(1 if np.where((any(output == 10) and any(output == 7)) or
            (any(output == 10) and any(output == 3)) or
            (any(output == 10) and any(output == 8)), 1, 0) else 0)
    return result

def pm2r0(data):
    mask = np.array([3, 7, 8])
    return [int(np.any(row==10) and np.any(np.isin(row, mask)))
        for row in data]

def pm2r1(data):
    mask = np.array([3, 7, 8])
    return [int(any(row==10) and any(np.in1d(row, mask)))
        for row in data]

def hpaulj0(data):
    mask=np.array([3, 7, 8])
    return [int(any(row==10) and any((row[:, None]==mask).flat))
        for row in data]

def hpaulj1(data, mask=np.array([3, 7, 8])):
    return [int(any(row==10) and any((row[:, None]==mask).flat))
        for row in data]
functions = (
    rebloch0,
    rebloch1,
    pm2r0,
    pm2r1,
    hpaulj0,
    hpaulj1,
)
# Verify that all functions give the same result
for func in functions:
    print('{:8}: {}'.format(func.__name__, func(the_data)))
print()

def time_test(loops, data):
    timings = []
    for func in functions:
        t = Timer(lambda: func(data))
        result = sorted(t.repeat(3, loops))
        timings.append((result, func.__name__))
    timings.sort()
    for result, name in timings:
        print('{:8}: {:.6f}, {:.6f}, {:.6f}'.format(name, *result))
    print()
time_test(1000, the_data)
typical output
rebloch0: [1, 0, 1, 0, 0, 1]
rebloch1: [1, 0, 1, 0, 0, 1]
pm2r0   : [1, 0, 1, 0, 0, 1]
pm2r1   : [1, 0, 1, 0, 0, 1]
hpaulj0 : [1, 0, 1, 0, 0, 1]
hpaulj1 : [1, 0, 1, 0, 0, 1]

hpaulj1 : 0.140421, 0.154910, 0.156105
hpaulj0 : 0.154224, 0.154822, 0.167101
rebloch1: 0.281700, 0.282764, 0.284599
rebloch0: 0.339693, 0.359127, 0.375715
pm2r1   : 0.367677, 0.368826, 0.371599
pm2r0   : 0.626043, 0.628232, 0.670199
Nice work, hpaulj!
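For readers unfamiliar with the row[:, None]==mask expression in the hpaulj versions, here is a minimal sketch of what it computes for a single row. The variable names table and hit are illustrative only.

```python
import numpy as np

row = np.array([14, 8, 10])
mask = np.array([3, 7, 8])

# row[:, None] has shape (3, 1); comparing it with mask (shape (3,))
# broadcasts to a (3, 3) table of every row item against every mask item.
table = row[:, None] == mask
print(table.shape)
print(table.astype(int))

# Iterating over .flat lets the built-in any stop at the first True cell.
hit = any(table.flat)
print(hit)
```

The only True cell is where the row item 8 meets the mask item 8, which is enough for the row to pass the second test.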