
I have a dataframe similar to the following example:

import pandas as pd
data = pd.DataFrame(data={'col1': [1,2,3,4,5,6,7,8,9], 'col2': [1.55,1.55,1.55,1.8,1.9,1.9,1.9,2.1,2.1]})

In the second column, col2, there are several duplicate values: 1.55 three times, 1.9 three times, and 2.1 twice. I need to remove every row whose col2 value is a duplicate of the previous row's value, so the first row of each run is the one I'd like to keep. In this example, that leaves the rows where col1 is 1, 4, 5, and 8, giving the following dataframe as my desired output:

clean_data = pd.DataFrame(data={'col1': [1,4,5,8], 'col2': [1.55,1.8,1.9,2.1]})

What is the best way to go about this for a dataframe which is much larger (in terms of rows) than this small example?

4 Comments

  • Do you want to remove rows that are a duplicate of just the immediately previous row, or rows that are a duplicate of any of the previous rows?
  • Only of the immediate previous row, not of all previous rows. Sorry for the unclear description.
  • Rereading your question, I think your intent is clear; my mistake.
  • For posterity: if you want to remove rows where the col2 entry is a duplicate of any of the preceding values, you can do clean_data = data.loc[~data['col2'].duplicated(),:]
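To make the distinction in the comments concrete, here is a small sketch (with made-up data where a value reappears after a gap) showing that `duplicated` drops repeats of *any* earlier value, not just of the immediately preceding row:

```python
import pandas as pd

# Illustrative data: 1.55 reappears after a different value (1.8).
df = pd.DataFrame({'col2': [1.55, 1.55, 1.8, 1.55]})

# duplicated() marks rows whose value already appeared ANYWHERE above,
# so the 1.55 at index 3 is dropped even though it starts a new run.
clean = df.loc[~df['col2'].duplicated(), :]
print(clean)  # keeps only index 0 (1.55) and index 2 (1.8)
```

With the question's requirement (drop duplicates of the immediate predecessor only), index 3 should be kept, which is why `duplicated` alone does not answer the question as asked.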

1 Answer


You can use shift: comparing each col2 value with the value in the row above keeps only the first row of each consecutive run (the very first row survives because NaN compares unequal to everything):

data.loc[data['col2'] != data['col2'].shift(1)]
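Applied to the question's example, this is a minimal runnable version of the approach:

```python
import pandas as pd

data = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                     'col2': [1.55, 1.55, 1.55, 1.8, 1.9, 1.9, 1.9, 2.1, 2.1]})

# shift(1) moves col2 down one row, so each value is compared with its
# immediate predecessor; only the first row of each consecutive run passes.
clean_data = data.loc[data['col2'] != data['col2'].shift(1)]
print(clean_data)
# keeps the rows where col1 is 1, 4, 5, 8
```

Since this is a single vectorized comparison, it scales well to dataframes that are much larger than the example.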
