I'm trying to create a new dataframe column that acts as a running counter which resets to zero or "passes" (stays unchanged) under certain conditions. Below is a simplified example of what I'm looking to accomplish. Let's say I'm trying to quit drinking coffee and I'm tracking the number of days in a row I've gone without drinking any. On days where I forgot to note whether I drank coffee, I put "forgot", and my tally is left unchanged.
Below is how I'm currently accomplishing this, though I suspect there's a much more efficient way of going about it.
Thanks in advance!
import pandas as pd
Day = [1,2,3,4,5,6,7,8,9,10,11]
DrankCoffee = ['no','no','forgot','yes','no','no','no','no','no','yes','no']
df = pd.DataFrame(list(zip(Day,DrankCoffee)), columns=['Day','DrankCoffee'])
df['Streak'] = 0
s = 0
for (index,row) in df.iterrows():
    if row['DrankCoffee'] == 'no':
        s += 1
    if row['DrankCoffee'] == 'yes':
        s = 0
    else:
        pass
    df.loc[index,'Streak'] = s

zero_streak. If the next entry is yes then just add +1 from the zero_streak index to current index. If no then set the new row for streak as 0 and update your zero_streak to the new index
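
The idea of remembering where the last reset happened can also be written without an explicit loop. What follows is only a sketch of one possible vectorized approach, not necessarily what was meant above: it assumes the same Day and DrankCoffee columns as in the question, and the helper names is_no and reset_group are made up for illustration. Each 'yes' starts a new group, and within a group the 'no' days are counted cumulatively, so 'forgot' days add nothing and leave the tally as it was.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'Day': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
    'DrankCoffee': ['no', 'no', 'forgot', 'yes', 'no', 'no',
                    'no', 'no', 'no', 'yes', 'no'],
})

# 1 on days without coffee, 0 otherwise ('yes' and 'forgot' both add nothing).
is_no = df['DrankCoffee'].eq('no').astype(int)

# Every 'yes' bumps the group id, so each group begins at a reset day.
reset_group = df['DrankCoffee'].eq('yes').cumsum()

# Running count of 'no' days inside each group.
df['Streak'] = is_no.groupby(reset_group).cumsum()

print(df)
```

On this sample data the Streak column should match what the iterrows loop in the question produces, while avoiding row-by-row assignment with df.loc.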