Using If-else to change values in Pandas

Question

I have a pandas DataFrame with three columns: ID, t, and ind1.

import pandas as pd
dat = {'ID': [1,1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,6,6,6],
        't': [0,1,2,3,0,1,2,0,1,2,3,0,1,2,0,1,0,1,2],
        'ind1' : [1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0]
        }

df = pd.DataFrame(dat, columns = ['ID', 't', 'ind1'])

print(df)

What I need to do is create a new column (res) such that:

for every ID with ind1 == 0, res is zero;
for every ID with ind1 == 1, res is 1 on the row where t == max(t) (grouped by ID) and zero otherwise.

Here’s the anticipated output

This is confusing: does ind1 == 1 mean that every row in a group must equal 1?
– BENY, Commented Aug 21, 2020 at 13:28

BENY · Accepted Answer · 2020-08-20 19:50:17Z

4

Use groupby with idxmax to find the row holding the largest t in each ID, then mask it with where, using transform('all') on ind1:

df['res'] = (df.groupby('ID').t.transform('idxmax')
               .where(df.groupby('ID').ind1.transform('all'))
               .eq(df.index)
               .astype(int))
df
Out[160]: 
    ID  t  ind1  res
0    1  0     1    0
1    1  1     1    0
2    1  2     1    0
3    1  3     1    1
4    2  0     0    0
5    2  1     0    0
6    2  2     0    0
7    3  0     0    0
8    3  1     0    0
9    3  2     0    0
10   3  3     0    0
11   4  0     1    0
12   4  1     1    0
13   4  2     1    1
14   5  0     1    0
15   5  1     1    1
16   6  0     0    0
17   6  1     0    0
18   6  2     0    0
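One property of the idxmax approach is worth keeping in mind: when several rows in a group tie for the largest t, idxmax returns only the first of them, whereas comparing t against the group maximum flags every tied row. The sample data has no ties, so this does not change the output above. A minimal sketch on a made-up frame (toy, by_idxmax and by_max are illustrative names, not part of the question):

```python
import pandas as pd

# Made-up single group in which two rows tie for the largest t.
toy = pd.DataFrame({'ID': [1, 1, 1], 't': [0, 2, 2], 'ind1': [1, 1, 1]})
g = toy.groupby('ID')

# idxmax-based rule: only the first row holding the maximum is flagged.
by_idxmax = (g['t'].transform('idxmax')
               .where(g['ind1'].transform('all'))
               .eq(toy.index)
               .astype(int))

# Equality-based rule: every row whose t equals the group maximum is flagged.
by_max = (toy['ind1'].eq(1) & toy['t'].eq(g['t'].transform('max'))).astype(int)

print(by_idxmax.tolist())
print(by_max.tolist())
```

On this toy frame the first list flags one row and the second flags both tied rows, so which rule fits depends on whether t can repeat within an ID.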

answered Aug 20, 2020 at 19:50

BENY


sammywemmy · Accepted Answer · 2020-08-20 22:03:53Z

2

This assumes that the ID column is sorted:

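import numpy as np

cond1 = df.ind1.eq(0)  # indicator is 0
cond2 = df.ind1.eq(1) & (df.t.eq(df.groupby("ID").t.transform("max")))  # indicator is 1 and t is the group's max

# The first matching condition wins; rows matching neither get the default 0.
df["res"] = np.select([cond1, cond2], [0, 1], 0)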

df


    ID  t  ind1  res
0    1  0     1    0
1    1  1     1    0
2    1  2     1    0
3    1  3     1    1
4    2  0     0    0
5    2  1     0    0
6    2  2     0    0
7    3  0     0    0
8    3  1     0    0
9    3  2     0    0
10   3  3     0    0
11   4  0     1    0
12   4  1     1    0
13   4  2     1    1
14   5  0     1    0
15   5  1     1    1
16   6  0     0    0
17   6  1     0    0
18   6  2     0    0
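A side note on the sortedness remark: groupby(...).transform returns a result aligned to the original index, so as far as the expression shown goes, row order should not affect which rows are flagged. The sketch below checks this on a small made-up frame by shuffling the rows first; add_res is a hypothetical helper wrapping the same np.select logic.

```python
import numpy as np
import pandas as pd

# Small made-up frame; not the asker's full data.
df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 4, 4],
                   't': [0, 1, 2, 0, 1, 0, 1],
                   'ind1': [1, 1, 1, 0, 0, 1, 1]})

def add_res(frame):
    """Apply the np.select rule from the answer to a copy of frame."""
    out = frame.copy()
    cond1 = out['ind1'].eq(0)
    cond2 = out['ind1'].eq(1) & out['t'].eq(out.groupby('ID')['t'].transform('max'))
    out['res'] = np.select([cond1, cond2], [0, 1], 0)
    return out

in_order = add_res(df)
# Shuffle the rows, compute res, then restore the original order to compare.
shuffled = add_res(df.sample(frac=1, random_state=0)).sort_index()

print(in_order['res'].tolist())
print(in_order['res'].equals(shuffled['res']))
```

If the two results match, sorting by ID beforehand is a convenience rather than a requirement for this particular expression.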

answered Aug 20, 2020 at 22:03

sammywemmy


5 Comments

user9292 Over a year ago

Thanks! Your solution was the fastest!

BENY Over a year ago

Have you tested whether it takes into account the requirement that all ind1 == 1?

sammywemmy Over a year ago

@BEN_YO, if you could point out the faulty row or rows, that would be helpful to me

BENY Over a year ago

@sammywemmy I think different people understand the OP's question differently, so it could be considered a poorly posed question. No worries, your answer should be right.

user9292 Over a year ago

@BEN_YO, yes, I did test it; my data has over 14M rows. Thanks for your solution. I did upvote it.
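The two readings discussed in these comments can be compared directly. In the sample data ind1 is constant within each ID, so a per-row test (ind1 == 1 on the row itself) and a per-group test (all ind1 values in the group are 1) give the same result. They differ only when a group mixes 0s and 1s, as in this hypothetical frame (mixed, row_level and group_level are illustrative names):

```python
import pandas as pd

# Hypothetical group whose ind1 values are not all the same.
mixed = pd.DataFrame({'ID': [7, 7, 7], 't': [0, 1, 2], 'ind1': [0, 1, 1]})
g = mixed.groupby('ID')

# True on the row(s) holding the largest t within each ID.
is_last = mixed['t'].eq(g['t'].transform('max'))

# Per-row reading: the row itself must have ind1 == 1.
row_level = (mixed['ind1'].eq(1) & is_last).astype(int)

# Per-group reading: every row of the group must have a non-zero ind1.
group_level = (g['ind1'].transform('all') & is_last).astype(int)

print(row_level.tolist())
print(group_level.tolist())
```

The per-row reading flags the last row of the group, while the per-group reading flags nothing, so the choice matters only if ind1 can vary within an ID.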

Space Impact · Accepted Answer · 2020-08-20 19:53:13Z

1

Use groupby.apply:

df['res'] = (df.groupby('ID').apply(lambda x: x['ind1'].eq(1)&x['t'].eq(x['t'].max()))
               .astype(int).reset_index(drop=True))

print(df)
    ID  t  ind1  res
0    1  0     1    0
1    1  1     1    0
2    1  2     1    0
3    1  3     1    1
4    2  0     0    0
5    2  1     0    0
6    2  2     0    0
7    3  0     0    0
8    3  1     0    0
9    3  2     0    0
10   3  3     0    0
11   4  0     1    0
12   4  1     1    0
13   4  2     1    1
14   5  0     1    0
15   5  1     1    1
16   6  0     0    0
17   6  1     0    0
18   6  2     0    0

answered Aug 20, 2020 at 19:53

Space Impact
