I have looked through various sites and SO posts. This seems easy, but somehow I am stuck. I am using

print(frame.loc[(frame['RR'].str.contains("^[^123]", na=False)), 'RR'].isin(series1.str.slice(1)))

to get

3     True
4    False
8    False
Name: RR, dtype: bool

Now I want only the indexes, so that I can pass them to dataframe.drop. Basically, I need to grab the index of every row where the value is True and drop those rows. Is there another way to do this without using the indexes?

1 Answer

You are testing two conditions on the same column so these can be combined (and negated):

frame[~((frame['RR'].str.contains("^[^123]", na=False)) & (frame['RR'].isin(series1.str.slice(1))))]

Here, the expression after the ~ operator checks whether each row satisfies both conditions, which gives the same boolean values you get in the end (rows that fail the first condition are simply False). The ~ operator then negates the result, turning True into False and False into True. Finally, frame[condition] uses boolean indexing to return the rows that satisfy the negated condition, i.e. the rows you want to keep.

In a more readable format:

condition1 = frame['RR'].str.contains("^[^123]", na=False)
condition2 = frame['RR'].isin(series1.str.slice(1))
frame[~(condition1 & condition2)]
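To make the combined condition concrete, here is a minimal sketch on made-up data. The contents of frame and series1 below are assumptions for illustration only, since the question does not show the real data.

```python
import pandas as pd

# Hypothetical sample data (not from the question).
frame = pd.DataFrame({"RR": ["1abc", "2def", "xyz", "4abc", None, "9qrs"]})
series1 = pd.Series(["Axyz", "B9qrs", "Cfoo"])

# True where RR does not start with 1, 2 or 3; missing values count as False.
condition1 = frame["RR"].str.contains(r"^[^123]", na=False)
# True where RR equals a series1 value with its first character stripped.
condition2 = frame["RR"].isin(series1.str.slice(1))

# Keep every row that does NOT satisfy both conditions.
result = frame[~(condition1 & condition2)]
print(result)
```

With this sample data, the rows labelled 2 and 5 satisfy both conditions and are filtered out, so no call to drop is needed.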

As an alternative (requires pandas 0.18.1 or later, which added indexing with a callable), you can get the index labels of the True elements with:

frame.loc[(frame['RR'].str.contains("^[^123]", na=False)), 'RR'].isin(series1.str.slice(1))[lambda df: df].index
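The index-based route can be sketched end to end as follows, including the drop call the question asks about. The sample frame and series1 are assumptions for illustration, and the sketch uses plain boolean indexing (mask[mask]) rather than the callable form, which should select the same labels.

```python
import pandas as pd

# Hypothetical sample data (not from the question).
frame = pd.DataFrame({"RR": ["1abc", "2def", "xyz", "4abc", None, "9qrs"]})
series1 = pd.Series(["Axyz", "B9qrs", "Cfoo"])

# Rows whose RR does not start with 1, 2 or 3.
subset = frame.loc[frame["RR"].str.contains(r"^[^123]", na=False), "RR"]
# Boolean Series over that subset only.
mask = subset.isin(series1.str.slice(1))

# Index labels where the mask is True, then drop those rows.
to_drop = mask[mask].index
result = frame.drop(to_drop)
print(list(to_drop))
print(result)
```

Note that drop returns a new DataFrame by default, so the result has to be assigned if frame itself should change.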

4 Comments

So which of the two would be faster, considering the frames would have a million entries?
Most probably the first one. The second one first builds an intermediate Series and tests another condition on it just to get the indices, which then have to be passed to another method (drop). I've tested on a small dataframe and the first one is 2x faster (without considering the drop part). Of course this may change based on the entries you have in series1, in frame['RR'], etc.
Cool, thanks a lot! Can you share some good resources on pandas? The info is scattered all over the place.
You are welcome. My main sources are the Q&As here and the docs. There is also Wes McKinney's book (although it is a little dated, a new edition is about to come out). Lastly, Tom Augspurger's Modern Pandas series is quite nice.
