Using pandas, I'm trying to keep only the rows whose list in a given column contains a specified value. I tried the Python in operator, but it doesn't work. Is there a way to do this without looping?

import pandas as pd

df = pd.DataFrame({'A' : [1,2,3,4], 'B' : [[1, 2, 3], [2, 3], [3], [1, 2, 3]]})
df = 1 in df['B']  # my attempt

    A   B
0   1   [1, 2, 3]
1   2   [2, 3]
2   3   [3]
3   4   [1, 2, 3]

I'm trying to keep the rows where the list in column B contains 1, so the expected output is:

    A   B
0   1   [1, 2, 3]
3   4   [1, 2, 3]

but the result is always True.

Any help or explanation is welcome. Thank you!

1 Answer

You need to use a loop/list comprehension:

out = df[[1 in l for l in df['B']]]
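As a minimal, self-contained sketch of why the attempt in the question evaluates to True: the in operator applied to a Series tests the index labels, not the values, so 1 in df['B'] is True simply because 1 is one of the row labels. The names label_check and mask below are illustrative only and are not from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [[1, 2, 3], [2, 3], [3], [1, 2, 3]]})

# `in` on a Series checks the index labels (0, 1, 2, 3), not the lists,
# so this is True only because 1 happens to be a row label.
label_check = 1 in df['B']

# Test membership inside each list to get one boolean per row.
mask = [1 in lst for lst in df['B']]

# Boolean indexing keeps the rows where the mask is True.
out = df[mask]
print(out)
```

The comprehension still visits every row, but it does so in a single pass with no pandas overhead per element.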

A pandas version would be more verbose and less efficient:

out = df[df['B'].explode().eq(1).groupby(level=0).any()]

Output:

   A          B
0  1  [1, 2, 3]
3  4  [1, 2, 3]
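For readers who want to see what the explode-based one-liner does, here is a sketch that splits it into steps and compares the result with the list comprehension. The intermediate names (exploded, is_one, mask_pandas, mask_loop) are made up for illustration, and the sketch assumes the DataFrame has a unique index, since groupby(level=0) regroups the elements by index label.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [[1, 2, 3], [2, 3], [3], [1, 2, 3]]})

# One row per list element; each element keeps its original row label.
exploded = df['B'].explode()

# Element-wise comparison against the value we are looking for.
is_one = exploded.eq(1)

# Collapse back to one boolean per original row label.
mask_pandas = is_one.groupby(level=0).any()

# The list-comprehension mask, for comparison.
mask_loop = [1 in lst for lst in df['B']]

out = df[mask_pandas]
print(out)
```

Both masks select the same rows; the explode version just takes more intermediate steps to get there.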

3 Comments

Thank you for your help. I'm using pandas to improve performance, so I'd like to know whether this is possible without going through a for loop.
Hi @mozway, if my data has a thousand columns, would it be more verbose and less efficient?
Here you don't have a choice: you cannot vectorize operations on a Series of lists. The list comprehension is very likely your best option.
