Get index of column where consecutive values are zero in pandas df

Question

I have a pandas DataFrame in Python like the one below:


       user_id  2020-03  2020-04  2020-05  2020-06  2020-07  2020-08  2020-09  2020-10  2020-11  2020-12  2021-01  2021-02  2021-03    
0            5     20.0      0.0      0.0     38.0     45.0     54.0     83.0    107.0    129.0    146.0    174.0    136.0     33.0
1            7      5.0     13.0     26.0     27.0     19.0     13.0      7.0     14.0     21.0     17.0     13.0      5.0      5.0   
2           14      0.0      7.0     25.0     22.0     60.0     13.0      1.0     25.0     49.0     16.0      6.0      0.0      0.0   
3           16      0.0      2.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0

I want to know the first month (column) where there are two consecutive columns with the value 0. So for example:


       user_id  2020-03  2020-04  2020-05  2020-06  2020-07  2020-08  2020-09  2020-10  2020-11  2020-12  2021-01  2021-02  2021-03  first_month   
0            5     20.0      0.0      0.0     38.0     45.0     54.0     83.0    107.0    129.0    146.0    174.0    136.0     33.0   2020-04
1            7      5.0     13.0     26.0     27.0     19.0     13.0      7.0     14.0     21.0     17.0     13.0      5.0      5.0   -
2           14      0.0      7.0     25.0     22.0     60.0     13.0      1.0     25.0     49.0     16.0      6.0      0.0      0.0   2021-02
3           16      0.0      2.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0   2020-05

Can anyone help me?

anky · Accepted Answer · 2021-03-15 16:18:31Z

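You can do this by comparing the frame with a copy of itself shifted one column to the left (df.shift with axis=1), taking the first matching column with idxmax, and then using where together with any to blank out rows that have no match: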

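u = df.drop(columns='user_id')
# True where a column and the next column to its right are both 0
c = u.eq(0) & u.shift(-1, axis=1).eq(0)
# idxmax gives the first True column; where() leaves NaN for rows with no match
df['first_month'] = c.idxmax(axis=1).where(c.any(axis=1))  # use .where(c.any(axis=1), '-') to fill with '-' instead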

print(df)

    user_id  2020-03  2020-04  2020-05  2020-06  2020-07  2020-08  2020-09  \
0        5     20.0      0.0      0.0     38.0     45.0     54.0     83.0   
1        7      5.0     13.0     26.0     27.0     19.0     13.0      7.0   
2       14      0.0      7.0     25.0     22.0     60.0     13.0      1.0   
3       16      0.0      2.0      0.0      0.0      0.0      0.0      0.0   

   2020-10  2020-11  2020-12  2021-01  2021-02  2021-03 first_month  
0    107.0    129.0    146.0    174.0    136.0     33.0     2020-04  
1     14.0     21.0     17.0     13.0      5.0      5.0         NaN  
2     25.0     49.0     16.0      6.0      0.0      0.0     2021-02  
3      0.0      0.0      0.0      0.0      0.0      0.0     2020-05
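
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same shift-and-idxmax idea. It rebuilds the sample frame from the question by hand and passes axis and columns as keywords. The helper name first_double_zero and its id_col and missing parameters are illustrative assumptions, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question: 13 monthly columns, 2020-03 to 2021-03.
months = [f"2020-{m:02d}" for m in range(3, 13)] + [f"2021-{m:02d}" for m in range(1, 4)]
values = [
    [20, 0, 0, 38, 45, 54, 83, 107, 129, 146, 174, 136, 33],
    [5, 13, 26, 27, 19, 13, 7, 14, 21, 17, 13, 5, 5],
    [0, 7, 25, 22, 60, 13, 1, 25, 49, 16, 6, 0, 0],
    [0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
df = pd.DataFrame(values, columns=months, dtype=float)
df.insert(0, "user_id", [5, 7, 14, 16])


def first_double_zero(frame, id_col="user_id", missing="-"):
    """Hypothetical helper: label of the first column that starts a pair of zeros."""
    u = frame.drop(columns=id_col)
    # True where a column and its right-hand neighbour are both 0.
    # The last column is compared with NaN after the shift, so it is never True.
    pair = u.eq(0) & u.shift(-1, axis=1).eq(0)
    # idxmax picks the first True per row; rows without any True get `missing`.
    return pair.idxmax(axis=1).where(pair.any(axis=1), missing)


result = first_double_zero(df)
print(result.tolist())  # ['2020-04', '-', '2021-02', '2020-05']
```

Assigning the result with df['first_month'] = first_double_zero(df) would reproduce the column shown in the question.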

Quang Hoang · Accepted Answer · 2021-03-15 16:13:31Z


You can try shift, and idxmax:

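import numpy as np

# 1 where the value is 0, else 0 (integers, so the sum below is well defined)
s = df.iloc[:, 1:].eq(0).astype(int)
# a sum of 2 means this column and the next one are both 0
s = (s + s.shift(-1, fill_value=0, axis=1)) == 2

df['first_month'] = np.where(s.any(axis=1), s.idxmax(axis=1), '-')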

Output (just np.where part):

array(['2020-04', '-', '2021-02', '2020-05'], dtype=object)
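
Both answers guard the idxmax result with any. The reason is that idxmax on a row with no True values still returns a label, namely the first column. The small sketch below illustrates this on a made-up boolean mask; the row labels has_pair and no_pair and the variable names are assumptions chosen for the example.

```python
import pandas as pd

# Made-up mask: the first row has one match, the second row has none.
mask = pd.DataFrame(
    {
        "2020-03": [False, False],
        "2020-04": [True, False],
        "2020-05": [False, False],
    },
    index=["has_pair", "no_pair"],
)

# Without a guard, the all-False row falls back to the first column label.
unguarded = mask.idxmax(axis=1)
print(unguarded.tolist())  # ['2020-04', '2020-03']

# With the guard, rows that have no True value get a placeholder instead.
guarded = unguarded.where(mask.any(axis=1), "-")
print(guarded.tolist())  # ['2020-04', '-']
```

Without the guard, user 7 in the question's data would be reported as 2020-03 even though that row contains no zeros at all.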

