I'm trying to calculate running difference on the date column depending on "event column".
So, to add another column with date difference between 1 in event column (there only 0 and 1).
Spo far I came to this half-working crappy solution
Dataframe:
df = pd.DataFrame({'date':[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17],'event':[0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0],'duration':None})
Code:
x = df.loc[df['event']==1, 'date']
k = 0
for i in range(len(x)):
df.loc[k:x.index[i], 'duration'] = x.iloc[i] - k
k = x.index[i]
But I'm sure there is a more elegant solution. Thanks for any advice.
Output format:
+------+-------+----------+
| date | event | duration |
+------+-------+----------+
| 1 | 0 | 3 |
| 2 | 0 | 3 |
| 3 | 1 | 3 |
| 4 | 0 | 6 |
| 5 | 0 | 6 |
| 6 | 0 | 6 |
| 7 | 0 | 6 |
| 8 | 0 | 6 |
| 9 | 1 | 6 |
| 10 | 0 | 4 |
| 11 | 0 | 4 |
| 12 | 0 | 4 |
| 13 | 1 | 4 |
| 14 | 0 | 2 |
| 15 | 1 | 2 |
+------+-------+----------+