Creating conditions on np.where in Pandas based on value in current column

Question

I have a dataframe in Pandas (subset below).

DATE        IN  200D_MA  TEST
10/30/2013   0        1     0
10/31/2013   0        1     0
11/1/2013    1        1     1  IN and 200D_MA both = 1, so TEST = 1
11/4/2013    0        1     1  previous TEST row = 1 and 200D_MA = 1, so TEST = 1
11/5/2013    0        1     1  previous TEST row = 1 and 200D_MA = 1, so TEST = 1
11/6/2013    0        1     1
11/7/2013    0        1     1
11/8/2013    0        1     1
11/11/2013   0        0     0  previous TEST row = 1 but 200D_MA = 0, so TEST = 0

This is easy to do in Excel, so I thought it would be easy to do in Python. I have this code using nested np.where calls:

df3['TEST'] = np.where( (df3['IN'] == 1) & (df3['200D_MA'] == 1),1,\
                       np.where( (df3['TEST'].shift(-1) == 1)\
                       & (df3['200D_MA'] == 1),1,0))

but it throws KeyError: 'IN', presumably because I am using a condition on a column that has not been created yet. Can anyone help me figure out how to do this?
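
The rule annotated in the table is a recurrence: each TEST value depends on the TEST value of the row before it, so it cannot be evaluated in a single pass over a column that does not exist yet. (Note also that `shift(-1)` looks at the next row; `shift(1)` is the previous one.) A minimal loop-based sketch of the rule follows. It is not the asker's code: `running_test` is a hypothetical helper name, and treating the "previous TEST" of the first row as 0 is an assumption.

```python
import pandas as pd

# Sample data copied from the table in the question.
df3 = pd.DataFrame({
    "DATE": ["10/30/2013", "10/31/2013", "11/1/2013", "11/4/2013", "11/5/2013",
             "11/6/2013", "11/7/2013", "11/8/2013", "11/11/2013"],
    "IN":      [0, 0, 1, 0, 0, 0, 0, 0, 0],
    "200D_MA": [1, 1, 1, 1, 1, 1, 1, 1, 0],
})


def running_test(in_flags, ma_flags):
    """Hypothetical helper: TEST[i] = 1 if MA[i] == 1 and (IN[i] == 1 or TEST[i-1] == 1)."""
    out = []
    prev = 0  # assumption: the first row has no predecessor, so start from 0
    for in_flag, ma_flag in zip(in_flags, ma_flags):
        prev = 1 if ma_flag == 1 and (in_flag == 1 or prev == 1) else 0
        out.append(prev)
    return out


df3["TEST"] = running_test(df3["IN"], df3["200D_MA"])
print(df3)
```

A plain loop like this is slow on large frames, but it is useful as a reference against which to check any vectorized version.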

BENY · Accepted Answer · 2018-05-04 18:56:48Z

1

It seems you need a conditional ffill:

df['TEST']=df.loc[df.IN==1,'IN']
df.loc[df['200D_MA']==1,'TEST']=df.loc[df['200D_MA']==1,'TEST'].ffill()
df.fillna(0,inplace=True)
df.TEST=df.TEST.astype(int)
df
Out[349]: 
         DATE  IN  200D_MA  TEST
0  10/30/2013   0        1     0
1  10/31/2013   0        1     0
2   11/1/2013   1        1     1
3   11/4/2013   0        1     1
4   11/5/2013   0        1     1
5   11/6/2013   0        1     1
6   11/7/2013   0        1     1
7   11/8/2013   0        1     1
8  11/11/2013   0        0     0
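
For anyone who wants to run this end to end, here is a self-contained restatement of the same conditional-ffill idea on the question's sample data. It is a sketch rather than the answer's exact code: it works on a temporary Series (`test`, a name introduced here) and fills NaN only in that Series instead of calling `fillna` on the whole frame.

```python
import pandas as pd

# IN and 200D_MA values copied from the question's table.
df = pd.DataFrame({
    "IN":      [0, 0, 1, 0, 0, 0, 0, 0, 0],
    "200D_MA": [1, 1, 1, 1, 1, 1, 1, 1, 0],
})

ma_on = df["200D_MA"] == 1

# Step 1: keep IN where it equals 1, NaN everywhere else.
test = df["IN"].where(df["IN"] == 1)

# Step 2: forward-fill, but only among the rows where 200D_MA == 1.
test.loc[ma_on] = test.loc[ma_on].ffill()

# Step 3: whatever is still NaN becomes 0, then cast back to int.
df["TEST"] = test.fillna(0).astype(int)
print(df)
```

Because step 2 forward-fills over the subset of rows where 200D_MA is 1, rows where it is 0 are skipped rather than treated as a reset.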


Scott Boston · Answer · 2018-05-04 19:04:03Z

1

I think you can use rolling to calculate the previous TEST row.

df['TEST'] = (df['IN 200D_MA'] & df['IN 200D_MA'].rolling(2).min().shift(1)).astype(int)

Output:

            DATE  IN 200D_MA  TEST
10/30/2013     0           1     0
10/31/2013     0           1     0
11/1/2013      1           1     1
11/4/2013      0           1     1
11/5/2013      0           1     1
11/6/2013      0           1     1
11/7/2013      0           1     1
11/8/2013      0           1     1
11/11/2013     0           0     0


3 Comments

J Westwood Over a year ago

They both seem to work. I'm not sure how to give proper credit or which one to mark as correct. Thank you both.

Scott Boston Over a year ago

First answer, fastest/most efficient solution, the one you understand the best, or just plain flip a coin. :) Nah, credit Wen.

J Westwood Over a year ago

Actually, with more tests, there are certain conditions where the solutions don't always work. I don't know how to add code in a comment, but @Wen, with your code, if the 200D_MA column goes to 0 and then back to 1, your solution will show a 1 in the TEST column when it should only show 1 if the IN column had a prior 1.
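
The case described in this comment can be handled by resetting the carry-over every time 200D_MA changes value. The following is one possible sketch, not code from either answer: it labels each run of consecutive equal 200D_MA values and takes a cumulative max of IN within each run. The sample data is hypothetical, built to include a 1 -> 0 -> 1 gap in 200D_MA, and `run_id` and `seen_in` are names introduced here.

```python
import pandas as pd

# Hypothetical data: 200D_MA drops to 0 at row 3 and returns to 1 at row 4
# with no new IN signal until row 6.
df = pd.DataFrame({
    "IN":      [0, 1, 0, 0, 0, 0, 1, 0],
    "200D_MA": [1, 1, 1, 0, 1, 1, 1, 0],
})

ma = df["200D_MA"]

# Label each run of consecutive equal 200D_MA values: 1, 1, 1, 2, 3, 3, 3, 4.
run_id = (ma != ma.shift()).cumsum()

# Within a run, cummax() answers "has IN been 1 yet in this run?".
seen_in = df["IN"].groupby(run_id).cummax()

# TEST is 1 only while 200D_MA is 1 and an IN signal has occurred in the run.
df["TEST"] = (seen_in * ma).astype(int)
print(df)
```

On this data rows 4 and 5 stay at 0 until IN fires again at row 6, whereas a forward-fill restricted to the 200D_MA == 1 rows would carry the earlier 1 across the gap.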
