
Is it possible to filter rows by the contents of an array column without creating new columns?

For example, I have this DataFrame:

userID   goalsID
25         [1,2,4,5]
188        [3,6]
79         [1,9]

How can I keep only the rows whose "goalsID" array contains the value 3? I need this result:

userID   goalsID
188       [3,6]

I tried moving the array values into new columns, but I would like to know whether the array can be filtered directly.

    out = df[[3 in l for l in df['goalsID']]
    Commented Aug 28, 2024 at 12:25

1 Answer


There may be a more specialized function, but a lambda function does the trick:

import pandas as pd

df = pd.DataFrame({'userID':[25,188,79],'goalsID':[[1,2,4,5],[3,6],[1,9]]})

df[df.apply(lambda x: 3 in x.goalsID, axis=1)]

Alternatively, df[df.goalsID.apply(lambda x: 3 in x)] accomplishes the same thing by applying the membership test to the column alone rather than to each whole row.
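For comparison, here is a minimal sketch that puts the two apply variants next to the list comprehension from the comment on the question, written with both closing brackets. The variable names (mask, by_comprehension, and so on) are made up for illustration; only the example data comes from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "userID": [25, 188, 79],
    "goalsID": [[1, 2, 4, 5], [3, 6], [1, 9]],
})

# Boolean mask built with a plain list comprehension:
# one True/False per row, True where the list contains 3.
mask = [3 in goals for goals in df["goalsID"]]
by_comprehension = df[mask]

# Same test applied to the column only (Series.apply).
by_series_apply = df[df["goalsID"].apply(lambda goals: 3 in goals)]

# Same test applied row by row (DataFrame.apply with axis=1).
by_row_apply = df[df.apply(lambda row: 3 in row.goalsID, axis=1)]

print(by_comprehension)
```

All three select the same rows without adding a column; they differ only in how the boolean mask is built.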


3 Comments

Using apply will be much slower than the approach suggested in the duplicate (and in the comment on the question).
On the scale of 10,000,000 rows with up to 10 items in each list I was at 1.9s for the apply approach vs. 1.6s for the list comprehension approach (which needs an extra closing bracket). So yes, it's faster, but for most use cases it may be within the realm of idiomatic preference.
I was thinking of the DataFrame.apply approach with axis=1, which is 60x slower in a quick test. Series.apply is more or less just a wrapper around a Python loop, so for large inputs it tends to have similar timings to the list comprehension (plus some overhead for apply).
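If you want to check these timings on your own machine, a rough sketch along the following lines may help. The row count, value range, random seed, and repeat count are all assumptions chosen so it runs quickly; it is far smaller than the 10,000,000 rows mentioned above, and the absolute numbers will vary with hardware and pandas version.

```python
import random
import timeit

import pandas as pd

random.seed(0)
n_rows = 20_000  # assumed size for a quick run, not the 10,000,000 rows discussed above

# Synthetic data: each row holds a list of 1 to 10 integers between 1 and 9.
df = pd.DataFrame({
    "userID": range(n_rows),
    "goalsID": [
        [random.randint(1, 9) for _ in range(random.randint(1, 10))]
        for _ in range(n_rows)
    ],
})

# The three filters discussed in this thread, wrapped as zero-argument callables.
candidates = {
    "list comprehension": lambda: df[[3 in goals for goals in df["goalsID"]]],
    "Series.apply": lambda: df[df["goalsID"].apply(lambda goals: 3 in goals)],
    "DataFrame.apply(axis=1)": lambda: df[df.apply(lambda row: 3 in row.goalsID, axis=1)],
}

repeats = 3  # assumed repeat count; raise it for steadier numbers
timings = {
    name: timeit.timeit(fn, number=repeats) / repeats
    for name, fn in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s per run")
```

No expected output is shown because the results depend on the environment; the point is only to compare the three approaches on the same data.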
