Remove consecutive duplicates of certain values from a DataFrame

Question

I have a DataFrame from which I want to remove consecutive duplicate values, but only when the value is 'true' or 'false'. I know how to remove all consecutive duplicate rows, but not how to restrict the removal to rows whose value is 'true' or 'false' while leaving other consecutive duplicates in place. This is what I have so far:

cols =['col_b']
df = df.loc[(df[cols].shift() != df[cols]).any(axis=1)]

For example:

col_a   col_b
21     'true'
25      'true'
76      'abc'
89      'ttt'
99      'ttt'
210     'false'
211     'false'
212     'false'

And I need the following result:

col_a   col_b
21     'true'
76      'abc'
89      'ttt'
99      'ttt'
210     'false'

But my code also removes the duplicate 'ttt' rows, which I need to keep.
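For reference, the behaviour described in the question can be reproduced with a small self-contained sketch. The DataFrame construction is an assumption based on the example table (plain strings without embedded quote characters); only the filtering line comes from the question.

```python
import pandas as pd

# Assumed reconstruction of the example table from the question.
df = pd.DataFrame({
    "col_a": [21, 25, 76, 89, 99, 210, 211, 212],
    "col_b": ["true", "true", "abc", "ttt", "ttt", "false", "false", "false"],
})

cols = ["col_b"]
# Keep a row only when col_b differs from the row directly above it.
# This collapses every run of equal values, including the 'ttt' run.
attempt = df.loc[(df[cols].shift() != df[cols]).any(axis=1)]
print(attempt)
```

The row with col_a 99 is dropped because the comparison does not look at which value is being repeated.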

BENY · Accepted Answer · 2020-09-22 15:45:51Z

3

Use shift with cumsum to label each run of consecutive equal values, then combine duplicated on those labels with a condition that the value is 'true' or 'false':

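# Give each run of consecutive equal values its own id; duplicated() flags every row after the first in a run
s1 = df.col_b.ne(df.col_b.shift()).cumsum().duplicated()
# Rows whose value is one of the two targeted strings (the quote characters are part of the value here)
s2 = df.col_b.isin(["'true'","'false'"])
# Drop only the rows that are both a repeat and a targeted value
df = df[~(s1 & s2)]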

df
   col_a    col_b
0     21   'true'
2     76    'abc'
3     89    'ttt'
4     99    'ttt'
5    210  'false'
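The same idea can be wrapped in a reusable function, shown here as a minimal sketch. The function name drop_consecutive_duplicates and its parameters are hypothetical, and the sample data assumes, as the answer does, that the quote characters are part of the stored strings.

```python
import pandas as pd

def drop_consecutive_duplicates(df, column, values):
    """Collapse runs of equal consecutive values, but only for the given values.

    Hypothetical helper, not part of pandas.
    """
    col = df[column]
    # Each run of identical consecutive values gets its own integer id.
    run_id = col.ne(col.shift()).cumsum()
    # True for every row of a run except the first one.
    repeated = run_id.duplicated()
    # True where the value is one we want to collapse.
    targeted = col.isin(values)
    return df[~(repeated & targeted)]

# Assumed reconstruction of the example table, with quotes stored in the strings.
df = pd.DataFrame({
    "col_a": [21, 25, 76, 89, 99, 210, 211, 212],
    "col_b": ["'true'", "'true'", "'abc'", "'ttt'", "'ttt'",
              "'false'", "'false'", "'false'"],
})

result = drop_consecutive_duplicates(df, "col_b", ["'true'", "'false'"])
print(result)
```

Because the mask requires both conditions, runs of 'true' and runs of 'false' are each reduced to their first row, while the 'ttt' run is left alone.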

edited Sep 22, 2020 at 15:45

answered Sep 22, 2020 at 15:29


1 Comment

Elham Over a year ago

It should remove duplicates of both true and false, not just the false value.

JC Royal Fish · Accepted Answer · 2020-09-22 15:56:23Z

1

All you need is an additional filter that restricts the removal to the values true and false.

>>> df["before"] = df["col_b"].shift()
>>> df
   col_a  col_b before
0     21   true    NaN
1     25   true   true
2     76    abc   true
3     89    ttt    abc
4     99    ttt    ttt
5    210  false    ttt
6    211  false  false
7    212  false  false
>>> df[~((df["col_b"] == df["before"]) & (df["before"].isin(["true", "false"])))].drop(["before"], axis="columns")
   col_a  col_b
0     21   true
2     76    abc
3     89    ttt
4     99    ttt
5    210  false
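As a variation on the above, the shifted values can be kept in a separate Series so that no helper column has to be added and dropped again. This is a sketch under the same assumption as this answer, namely that col_b holds plain strings without embedded quote characters; the variable names are illustrative.

```python
import pandas as pd

# Assumed reconstruction of the example table (plain strings).
df = pd.DataFrame({
    "col_a": [21, 25, 76, 89, 99, 210, 211, 212],
    "col_b": ["true", "true", "abc", "ttt", "ttt", "false", "false", "false"],
})

# Value of the previous row; NaN for the first row.
prev = df["col_b"].shift()
# True where a row repeats the value directly above it (False for the NaN row).
is_repeat = df["col_b"].eq(prev)
# True where the value is one of the two targeted strings.
is_target = df["col_b"].isin(["true", "false"])

result = df[~(is_repeat & is_target)]
print(result)
```

The original DataFrame is left unchanged, so there is nothing to clean up afterwards.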

answered Sep 22, 2020 at 15:56
