pd.Series of Lists - Check for Element in Lists

Question

I have a DataFrame in which one column has lists as entries. For a given value x, I want to get a pd.Series of booleans telling me whether x is in each list. For example, given the DataFrame

    index    lists
    0        []
    1        [1, 2]
    2        [1]
    3        [3, 4]

I want to do something like df.lists.contains(1) and get back False, True, True, False.

I am aware I can do this with a Python loop or comprehension, but I would ideally like a Pandas solution analogous to df.mod, df.isin etc.

MaxU - stand with Ukraine · Accepted Answer · 2018-03-30 18:42:29Z


In [79]: df['lists'].apply(lambda c: 1 in c)
Out[79]:
0    False
1     True
2     True
3    False
Name: lists, dtype: bool

PS: I think a list comprehension might be faster in this case.

Timing for a 40,000-row DataFrame:

In [81]: df = pd.concat([df] * 10**4, ignore_index=True)

In [82]: df.shape
Out[82]: (40000, 2)

In [83]: %timeit df['lists'].apply(lambda c: 1 in c)
22.5 ms ± 87.8 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [84]: %timeit [1 in x for x in df['lists']]
4.87 ms ± 25.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
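
Note that the list comprehension in In [84] returns a plain Python list rather than a Series. A minimal sketch of wrapping it back into a boolean Series that keeps the original index might look like the following; the helper name `contains_element` is hypothetical (it is not a pandas API), and the small DataFrame simply mirrors the one from the question.

```python
import pandas as pd


def contains_element(series, value):
    """Return a boolean Series: True where `value` is in the list at that row.

    Hypothetical helper for illustration; not part of pandas.
    """
    # The comprehension does the membership test row by row; wrapping the
    # result in pd.Series restores the original index and column name.
    return pd.Series(
        [value in item for item in series],
        index=series.index,
        name=series.name,
        dtype=bool,
    )


# Same data as in the question.
df = pd.DataFrame({"lists": [[], [1, 2], [1], [3, 4]]})
mask = contains_element(df["lists"], 1)
print(mask)
```

Because the result shares the index of `df`, it can be used directly for boolean indexing, e.g. `df[mask]`.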

edited Mar 30, 2018 at 18:42

answered Mar 30, 2018 at 18:39


2 Comments

Alex Over a year ago

Oh! I forgot about apply somehow, how stupid of me! Why would the list comprehension be faster?

MaxU - stand with Ukraine Over a year ago

@Alex, apply is essentially a slightly optimized for ... loop under the hood, so a list comprehension is often faster than DataFrame.apply(...)
