Say I have the following dataframe:
>>> import pandas as pd
>>> d=pd.DataFrame()
>>> d['A']=['12345','12354','76','4']
>>> d['B']=['4442','2345','33','5']
>>> d['C']=['5553','4343','33','5']
>>> d
       A     B     C
0  12345  4442  5553
1  12354  2345  4343
2     76    33    33
3      4     5     5
And say I have 2 values of interest:
>>> vals=['123','76']
I am interested in determining which values in my dataframe start with any of the values in my list. There are 3 cases in my example: (0,A) starts with 123; (1,A) starts with 123; and (2,A) starts with 76.
Is there a way I can do this without looping through each of my values?
If I were interested in matching values exactly I could just do:
>>> d.isin(vals)
       A      B      C
0  False  False  False
1  False  False  False
2   True  False  False
3  False  False  False
>>>
And if I were interested in whether the values start with one particular value I could do:
>>> d.applymap(lambda x:x.startswith('123'))
       A      B      C
0   True  False  False
1   True  False  False
2  False  False  False
3  False  False  False
>>>
But how can I combine these two to find any value that starts with any value in my list?
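You can apply str.match column-wise with a regex alternation of your values. str.match anchors at the start of each string, whereas str.contains would also match in the middle of a value:
d.apply(lambda x: x.str.match('|'.join(vals)))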
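For reference, here is a minimal, self-contained sketch of the "starts with any value in the list" check on the dataframe from the question. It assumes every cell is a string, and it passes each value through re.escape in case the list ever contains regex metacharacters. The names pattern and starts_with_any are illustrative only.

```python
import re
import pandas as pd

# Rebuild the example dataframe from the question
d = pd.DataFrame({
    'A': ['12345', '12354', '76', '4'],
    'B': ['4442', '2345', '33', '5'],
    'C': ['5553', '4343', '33', '5'],
})
vals = ['123', '76']

# Escape each prefix so it is matched literally, then join into an alternation
pattern = '|'.join(re.escape(v) for v in vals)

# Series.str.match only matches at the beginning of each string (like re.match),
# so this tests "starts with any of vals" one column at a time
starts_with_any = d.apply(lambda col: col.str.match(pattern))
print(starts_with_any)
```

On this data the result should be True for (0,A), (1,A) and (2,A), and False everywhere else.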