Python Pandas: Create Column That Acts As A Conditional Running Variable

Question

I'm trying to create a new dataframe column that acts as a running variable that resets to zero or "passes" under certain conditions. Below is a simplified example of what I'm looking to accomplish. Let's say I'm trying to quit drinking coffee and I'm tracking the number of days in a row I've gone without drinking any. On days when I forgot to note whether I drank coffee, I put "forgot", and my tally is left unchanged.

Below is how I'm currently accomplishing this, though I suspect there's a much more efficient way of going about it.

Thanks in advance!

import pandas as pd

Day = [1,2,3,4,5,6,7,8,9,10,11]  
DrankCoffee = ['no','no','forgot','yes','no','no','no','no','no','yes','no']

df = pd.DataFrame(list(zip(Day,DrankCoffee)), columns=['Day','DrankCoffee'])

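df['Streak'] = 0

s = 0  # running count of consecutive coffee-free days

for index, row in df.iterrows():
    if row['DrankCoffee'] == 'no':
        s += 1  # another day without coffee
    elif row['DrankCoffee'] == 'yes':
        s = 0   # streak resets
    # any other value (e.g. 'forgot') leaves the tally unchanged

    df.loc[index, 'Streak'] = s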

Could you give more details on how the problem is structured? It seems you could use iloc and keep track of the last 0 in your Streak column; let us call it zero_streak. If the next entry is yes, then just add +1 from the zero_streak index to the current index. If no, then set the new row for the streak to 0 and update zero_streak to the new index.
– Haris Nadeem, Commented May 2, 2018 at 21:34

Maarten Fabré · Accepted Answer · 2018-05-02 21:35:44Z

4

You can use groupby.transform.

For each streak, what you're looking for is something like this:

def my_func(group):
    return (group == 'no').cumsum()

You can separate the different streaks with a simple comparison and cumsum:

streak = (df['DrankCoffee'] == 'yes').cumsum()

Then apply the transform:

df['Streak'] = df.groupby(streak)['DrankCoffee'].transform(my_func)
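Putting the pieces of this answer together, a minimal end-to-end sketch might look like the following. The DataFrame is rebuilt from the question's data (using a dict rather than zip, which is only a convenience), and the intermediate `streak` key is printed so the grouping is visible.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Day": range(1, 12),
    "DrankCoffee": ["no", "no", "forgot", "yes", "no", "no",
                    "no", "no", "no", "yes", "no"],
})

def my_func(group):
    # Within one streak, count the 'no' days seen so far.
    # 'forgot' and 'yes' compare False, so they add nothing.
    return (group == "no").cumsum()

# Every 'yes' bumps the key by one, so each 'yes' opens a new group.
streak = (df["DrankCoffee"] == "yes").cumsum()
print(streak.tolist())

# Run the per-group running count and align it back to the original rows.
df["Streak"] = df.groupby(streak)["DrankCoffee"].transform(my_func)
print(df)
```

Because each group starts on its 'yes' row, the running count there is 0, which is what produces the reset.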

1 Comment

crowsnest Over a year ago

Thanks!! All of the responses are great, but I think this one was the easiest for me to understand the step by step process.

BENY · Answer · 2018-05-02 21:34:14Z

3

You first need to map DrankCoffee to [0, 1] (based on my understanding, 'yes' and 'forgot' should be 0 and 'no' should be 1). Then we use groupby with cumsum to create the group key: each 'yes' starts a new round of counting those events.

df.DrankCoffee.replace({'no':1,'forgot':0,'yes':0}).groupby((df.DrankCoffee=='yes').cumsum()).cumsum()
Out[111]: 
0     1
1     2
2     2
3     0
4     1
5     2
6     3
7     4
8     5
9     0
10    1
Name: DrankCoffee, dtype: int64

1 Comment

Maarten Fabré Over a year ago

Mapping DrankCoffee to 0, 1 can be easier with == 'no'.
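As a rough sketch of that suggestion, the replace dictionary can be swapped for a boolean comparison cast to integers. The variable names (`drank`, `is_no`, `group_key`, `result`) are illustrative only and do not come from the answer.

```python
import pandas as pd

# The DrankCoffee column from the question, as a standalone Series.
drank = pd.Series(
    ["no", "no", "forgot", "yes", "no", "no", "no", "no", "no", "yes", "no"],
    name="DrankCoffee",
)

# 1 on 'no', 0 on everything else ('yes' and 'forgot').
is_no = drank.eq("no").astype(int)

# Group key that increases by one at every 'yes'.
group_key = drank.eq("yes").cumsum()

# Running total of 'no' days inside each group.
result = is_no.groupby(group_key).cumsum()
print(result)
```

This treats any value other than 'no' as 0, so an unexpected label would behave like 'forgot' rather than being left unmapped.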

Scott Boston · Answer · 2018-05-02 21:42:24Z

2

Use:

df['Streak'] = df.assign(streak=df['DrankCoffee'].eq('no'))\
                 .groupby(df['DrankCoffee'].eq('yes').cumsum())['streak'].cumsum().astype(int)

Output:

    Day DrankCoffee  Streak
0     1          no       1
1     2          no       2
2     3      forgot       2
3     4         yes       0
4     5          no       1
5     6          no       2
6     7          no       3
7     8          no       4
8     9          no       5
9    10         yes       0
10   11          no       1

First, create the streak increment: True when the value is 'no'.
Next, mark where each streak starts: every 'yes' begins a new streak, numbered with cumsum().
Lastly, use cumsum() within each streak to add up the increments.
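The three steps above can be laid out as separate helper columns, which makes each stage easy to inspect. This is a sketch rather than the answer's exact code: the column names `increment` and `streak_id` are made up for illustration, and the increment is cast to int before the grouped cumsum instead of after it.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Day": range(1, 12),
    "DrankCoffee": ["no", "no", "forgot", "yes", "no", "no",
                    "no", "no", "no", "yes", "no"],
})

steps = df.assign(
    # Step 1: 1 on a 'no' day, 0 otherwise.
    increment=df["DrankCoffee"].eq("no").astype(int),
    # Step 2: streak number, increased by one at every 'yes'.
    streak_id=df["DrankCoffee"].eq("yes").cumsum(),
)

# Step 3: running sum of the increments within each streak.
steps["Streak"] = steps.groupby("streak_id")["increment"].cumsum()
print(steps)
```

Dropping the two helper columns afterwards, for example with `steps.drop(columns=["increment", "streak_id"])`, leaves the same three columns shown in the output above.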
