Suppose I have a DataFrame df:
t   status
1   ok
2   ok
3   ok
4   closed
5   closed
6   closed
7   bad input
8   bad input
9   closed
10  closed
11  ok
12  ok
13  closed
14  closed
I want to identify at what time each run of "closed" starts and how long it lasts.
So the result should be:
t   status     index
1   ok         0
2   ok         0
3   ok         0
4   closed     1
5   closed     1
6   closed     1
7   bad input  0
8   bad input  0
9   closed     2
10  closed     2
11  ok         0
12  ok         0
13  closed     3
14  closed     3
I tried a standard for-loop approach, but it is not feasible for a large DataFrame. I am thinking of a solution using NumPy's where and repeat:
np.where(df['status'] == 'closed', 1, 0)
I am stuck on adding 1 every time "closed" reappears.
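For reference, here is a minimal sketch of the kind of vectorized result I am after. It uses pandas shift and cumsum rather than np.where and repeat, so it is one possible approach and not necessarily the best one. The names is_closed, run_start and runs are just illustrative, and the sample frame is rebuilt from the table above.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "t": range(1, 15),
    "status": ["ok"] * 3 + ["closed"] * 3 + ["bad input"] * 2
              + ["closed"] * 2 + ["ok"] * 2 + ["closed"] * 2,
})

# Boolean mask of rows whose status is exactly "closed" (case-sensitive).
is_closed = df["status"].eq("closed")

# True only on the first row of each consecutive "closed" run:
# the row is closed and the previous row was not.
run_start = is_closed & ~is_closed.shift(fill_value=False)

# Running count of run starts, reset to 0 on rows that are not "closed".
df["index"] = run_start.cumsum().where(is_closed, 0)

# Start time and length (in rows) of each "closed" run.
runs = df[is_closed].groupby("index")["t"].agg(start="min", length="count")
print(df)
print(runs)
```

Here runs would hold one row per "closed" run, with the first t of the run and the number of rows it spans. Note that length counts rows, which equals elapsed time only if t is evenly spaced.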