I have a Pandas dataframe where the values are lists:
import pandas as pd
DF = pd.DataFrame({'X':[[1, 5], [1, 2]], 'Y':[[1, 2, 5], [1, 3, 5]]})
DF
        X          Y
0  [1, 5]  [1, 2, 5]
1  [1, 2]  [1, 3, 5]
I want to check, row by row, whether each list in X is a subset of the corresponding list in Y. With individual lists, we can do this using set(x).issubset(set(y)). But how would we do this across two DataFrame columns?
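For reference, here is a minimal sketch of the single-pair check, using the two rows from the frame above. The variable names are only for illustration.

```python
# Row 0 and row 1 of the example frame, written out as plain lists
x0, y0 = [1, 5], [1, 2, 5]
x1, y1 = [1, 2], [1, 3, 5]

# issubset accepts any iterable, so the second list does not strictly
# need to be converted to a set first
row0_ok = set(x0).issubset(y0)
row1_ok = set(x1).issubset(y1)

print(row0_ok)  # True: both 1 and 5 appear in y0
print(row1_ok)  # False: 2 does not appear in y1
```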
So far, the only thing I've come up with is to use the individual lists as a workaround, then convert the result back to Pandas. Seems a bit complicated for this task:
foo = [set(DF['X'][i]).issubset(set(DF['Y'][i])) for i in range(len(DF['X']))]
foo = pd.DataFrame(foo)
foo.columns = ['x_sub_y']
pd.merge(DF, foo, how = 'inner', left_index = True, right_index = True)
        X          Y  x_sub_y
0  [1, 5]  [1, 2, 5]     True
1  [1, 2]  [1, 3, 5]    False
Is there an easier way to achieve this? Possibly using .map or .apply?
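One possible sketch of the .apply route, assuming the same two-row frame as above. The column names x_sub_y and x_sub_y_zip are just illustrative, and the second variant is a plain list comprehension rather than a pandas method.

```python
import pandas as pd

DF = pd.DataFrame({'X': [[1, 5], [1, 2]], 'Y': [[1, 2, 5], [1, 3, 5]]})

# axis=1 hands each row to the lambda as a Series, so row['X'] and
# row['Y'] are the two lists for that row
DF['x_sub_y'] = DF.apply(lambda row: set(row['X']).issubset(row['Y']), axis=1)

# Same check with zip over the two columns, assigned directly as a new
# column so no intermediate DataFrame or merge is needed
DF['x_sub_y_zip'] = [set(x).issubset(y) for x, y in zip(DF['X'], DF['Y'])]

print(DF)
```

Both variants still loop in Python under the hood. The main gain over the merge-based workaround is that the result is assigned straight to a new column.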
np.vectorize works as well: stackoverflow.com/a/46163829/4909087
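A rough sketch of what the np.vectorize suggestion might look like for this frame. This is an assumption about how it would be applied here, not a copy of the linked answer, and is_subset is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

DF = pd.DataFrame({'X': [[1, 5], [1, 2]], 'Y': [[1, 2, 5], [1, 3, 5]]})

def is_subset(x, y):
    """Return True if every element of list x also appears in list y."""
    return set(x).issubset(y)

# otypes=[bool] fixes the output dtype up front, so NumPy does not have
# to call the function once extra just to infer it
check = np.vectorize(is_subset, otypes=[bool])

# to_numpy() yields 1-D object arrays whose elements are the lists, so
# each call to is_subset receives one list from X and one from Y
DF['x_sub_y'] = check(DF['X'].to_numpy(), DF['Y'].to_numpy())

print(DF)
```

Note that np.vectorize is documented as a convenience wrapper around a Python-level loop, so it is unlikely to be dramatically faster than a list comprehension for this kind of check.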