Pandas filter rows based on values from 2 columns in another dataframe

Question

Assume we have 2 dfs.

df1:

        Product   Name1    Name2
    0   ABC       aaa      bbb
    1   ASD       sss      ddd
    2   ZCX       xxx      NaN
    3   QWE       qqq      NaN
    4   IOP       iii      ooo

df2:

        Title     Col1     Col2
    0   aaaLKJ    ...      ...
    1   zzzMNB    ...      ...
    2   dddKJH    ...      ...
    3   ooo123    ...      ...
    4   aaa098    ...      ...
    5   ttt676    ...      ...

The output df should consists all of the df2 rows, in which the 'Title' starts with either 'Name1' or 'Name2' of df1.

        Title     Col1     Col2
    0   aaaLKJ    ...      ...
    2   dddKJH    ...      ...
    3   ooo123    ...      ...
    4   aaa098    ...      ...

Please let me know how this could be achieved. I have tried several ways with df.loc, isin, str.contains, but without luck ...

yatu · Accepted Answer · 2020-09-04 12:10:58Z

2

You could build a list with all names from both columns in df1, and use Series.str.startswith on Title in df2, which also works with a tuple of strings, to find those which start with any term in names:

names = df1.filter(like='Name').stack().to_numpy()
# array(['aaa', 'bbb', 'sss', 'ddd', 'xxx', 'qqq', 'iii', 'ooo'], dtype=object)
df2 = df2[df2.Title.str.startswith(tuple(names))]

print(df2)
    Title Col1 Col2
0  aaaLKJ  ...  ...
2  dddKJH  ...  ...
3  ooo123  ...  ...
4  aaa098  ...  ...

edited Sep 4, 2020 at 12:10

answered Sep 4, 2020 at 12:04

yatu

88.7k12 gold badges93 silver badges148 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Pandas filter rows based on values from 2 columns in another dataframe

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related