I have a dataframe that holds student IDs and their score at each stage.

I want to find the stage at which each student dropped out, i.e. the stage where the first zero score appears, and then set the corresponding stage flag to 1. Below is the sample data:

StuID | Stage1 | Stage2 | Stage3 | Stage4  | S1Flag |S2Flag |S3Flag | S4Flag
Ak    | 80.1   |  23.3  |    0   |    0    |   0    |  0    |  1    | 0 
XF    |   0    |  0     |    0   |    0    |   1    |  0    |  0    | 0
WE    |  23    |  34    |    43  |    34   |   0    |  0    |  0    | 0

In the data above, for StuID = 'Ak' the first zero appears in stage 3, so S3Flag is set to 1. For StuID = 'XF' the first zero appears in stage 1, so S1Flag is set to 1. And so on for the other rows.

  • Where exactly are you facing the issue/error? Commented May 31, 2018 at 8:02
  • Hi Surya, I need help writing the code. I sense it needs some loops and updates, but I don't know how to implement them in Python. Commented May 31, 2018 at 8:08

1 Answer

First select only the Stage columns, compare them with 0, and take the cumulative sum along each row. Comparing that cumulative sum with 1 then gives a mask for the first 0 in each row:

m = df.filter(like='Stage').eq(0).cumsum(axis=1).eq(1)
print (m)
   Stage1  Stage2  Stage3  Stage4
0   False   False    True   False
1    True   False   False   False
2   False   False   False   False
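
One caveat worth noting: the `.eq(1)` comparison marks every cell where the running count of zeros equals 1, so it relies on all scores after the first zero also being zero, as in the sample data. If a non-zero score can follow the first zero, the mask can be combined with the zero test itself. A minimal sketch of that variant, using made-up scores and the hypothetical names `stages`, `is_zero`, `running`, `loose` and `strict`:

```python
import pandas as pd

# Hypothetical scores: row 0 has a non-zero value after its first zero.
stages = pd.DataFrame({
    "Stage1": [50, 0],
    "Stage2": [0, 0],
    "Stage3": [70, 0],
    "Stage4": [0, 0],
})

is_zero = stages.eq(0)             # True where the score is 0
running = is_zero.cumsum(axis=1)   # running count of zeros along each row

loose = running.eq(1)              # row 0: True at Stage2 AND Stage3
strict = is_zero & running.eq(1)   # row 0: True at Stage2 only

print(loose)
print(strict)
```

For data like the question's, where a dropped student scores 0 in every later stage, both masks are identical.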

Then select the Flag columns and set them to 1 wherever the mask is True:

cols = df.filter(like='Flag').columns
df[cols] = df[cols].mask(m.values, 1)
print (df)
  StuID  Stage1  Stage2  Stage3  Stage4  S1Flag  S2Flag  S3Flag  S4Flag
0    Ak    80.1    23.3       0       0       0       0       1       0
1    XF     0.0     0.0       0       0       1       0       0       0
2    WE    23.0    34.0      43      34       0       0       0       0 
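
For anyone who wants to run the two steps end to end, here is a self-contained sketch. It rebuilds the sample data from the question, assuming every flag starts at 0 (the question shows the flags only after the update), and then applies the same mask:

```python
import pandas as pd

# Sample data from the question; the all-zero starting flags are an assumption.
df = pd.DataFrame({
    "StuID": ["Ak", "XF", "WE"],
    "Stage1": [80.1, 0.0, 23.0],
    "Stage2": [23.3, 0.0, 34.0],
    "Stage3": [0, 0, 43],
    "Stage4": [0, 0, 34],
    "S1Flag": [0, 0, 0],
    "S2Flag": [0, 0, 0],
    "S3Flag": [0, 0, 0],
    "S4Flag": [0, 0, 0],
})

# Mask that is True at the first zero score of each row.
m = df.filter(like="Stage").eq(0).cumsum(axis=1).eq(1)

# Write 1 into the flag column that sits at the same position as the first zero.
cols = df.filter(like="Flag").columns
df[cols] = df[cols].mask(m.values, 1)

print(df)
```

Note that `filter(like=...)` matches on substrings of the column names, so this relies on the Stage and Flag columns appearing in the same order.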

Details:

print (df.filter(like='Stage'))
   Stage1  Stage2  Stage3  Stage4
0    80.1    23.3       0       0
1     0.0     0.0       0       0
2    23.0    34.0      43      34

print (df.filter(like='Stage').eq(0))
   Stage1  Stage2  Stage3  Stage4
0   False   False    True    True
1    True    True    True    True
2   False   False   False   False

print (df.filter(like='Stage').eq(0).cumsum(axis=1))
   Stage1  Stage2  Stage3  Stage4
0       0       0       1       2
1       1       2       3       4
2       0       0       0       0

print (df.filter(like='Stage').eq(0).cumsum(axis=1).eq(1))
   Stage1  Stage2  Stage3  Stage4
0   False   False    True   False
1    True   False   False   False
2   False   False   False   False
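
The last piece is `mask`: `DataFrame.mask(cond, other)` replaces the cells where `cond` is True with `other` and leaves the rest unchanged (`DataFrame.where` does the opposite). Because the condition here is built from the Stage columns while the target is the Flag columns, `.values` is used to pass a plain array, so cells are matched by position rather than by column label. A small sketch with made-up frames, where `flags`, `cond` and `updated` are illustrative names only:

```python
import pandas as pd

flags = pd.DataFrame({"S1Flag": [0, 0], "S2Flag": [0, 0]})
cond = pd.DataFrame({"Stage1": [False, True], "Stage2": [True, False]})

# cond.values is a plain boolean array, so the column labels no longer matter:
# each True cell is replaced by 1 at the same row/column position in flags.
updated = flags.mask(cond.values, 1)

print(updated)
```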

4 Comments

Thanks Jez, it worked. However, I need to dig into what the .eq, .cumsum and mask functions are doing.
@Mighty - added details for a better explanation.
Thank you so much :) @jezrael
@Mighty - Glad I could help!
