Filter column of lists in pandas dataframe

Question

I want to filter a DataFrame on a column of lists, keeping only the rows whose list contains nothing but specific items.

This is my original table:

id	code
1	[Hes3086, Hes3440, Hes3220]
2	[Hes3440, Nee8900]
3	[Hes1337, Hes3440]
4	[Nee8900, Hes3440]
5	[Hes1337, Nee8900]
6	[Hes3220, Nee8900]
7	[Hes3220, Nee8900, Hes3440]

I want only the rows whose lists contain nothing but these items: Hes3440, Nee8900, Hes3220

This should produce the following output:

id	code
2	[Hes3440, Nee8900]
4	[Nee8900, Hes3440]
6	[Hes3220, Nee8900]
7	[Hes3220, Nee8900, Hes3440]

I can already filter the dataset so that each row contains at least one of the desired items, but that is not what I want.

Would appreciate any help!

thanks, M

jezrael · Accepted Answer · 2022-04-08 12:07:13Z

Use set.issubset with Series.map to build a mask for boolean indexing:

# Items a row's list is allowed to contain
L = ['Hes3440','Nee8900','Hes3220']

# Keep only the rows whose list is a subset of L
df = df[df.code.map(lambda x: set(x).issubset(L))]
print(df)
   id                         code
1   2           [Hes3440, Nee8900]
3   4           [Nee8900, Hes3440]
5   6           [Hes3220, Nee8900]
6   7  [Hes3220, Nee8900, Hes3440]

List comprehension alternative:

df = df[[set(x).issubset(L) for x in df.code]]
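
To run this end to end, here is a minimal sketch that rebuilds the table from the question and applies both forms. The way the frame is constructed and the names allowed, mask_map, mask_comp and result are assumptions for illustration; the original post does not show how df was created.

```python
import pandas as pd

# Rebuild the table from the question (assumed construction;
# the original post does not show how df was created).
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7],
    "code": [
        ["Hes3086", "Hes3440", "Hes3220"],
        ["Hes3440", "Nee8900"],
        ["Hes1337", "Hes3440"],
        ["Nee8900", "Hes3440"],
        ["Hes1337", "Nee8900"],
        ["Hes3220", "Nee8900"],
        ["Hes3220", "Nee8900", "Hes3440"],
    ],
})

# Hypothetical name: the items a row's list is allowed to contain.
allowed = {"Hes3440", "Nee8900", "Hes3220"}

# Series.map form: True where every item of the list is in `allowed`.
mask_map = df["code"].map(lambda x: set(x).issubset(allowed))

# List comprehension form: builds the same mask as a plain list of booleans.
mask_comp = [set(x).issubset(allowed) for x in df["code"]]

result = df[mask_map]
print(result["id"].tolist())
```

Note that an empty list also passes this test, because the empty set is a subset of every set. If such rows should be dropped, add a length check to the condition, for example `len(x) > 0 and set(x).issubset(allowed)`.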
