I have a dataframe in Pandas (subset below).

DATE       IN 200D_MA   TEST    
10/30/2013  0   1        0  
10/31/2013  0   1        0  
11/1/2013   1   1        1  IN & 200D_MA both = 1, so TEST = 1
11/4/2013   0   1        1  PREVIOUS TEST ROW = 1 & 200D_MA = 1, so TEST = 1
11/5/2013   0   1        1  PREVIOUS TEST ROW = 1 & 200D_MA = 1, so TEST = 1
11/6/2013   0   1        1  
11/7/2013   0   1        1  
11/8/2013   0   1        1  
11/11/2013  0   0        0  PREVIOUS TEST ROW = 1 & 200D_MA = 0, so TEST = 0

This is easy to do in Excel, so I thought it would be easy to do in Python. I have this code using nested np.where calls:

df3['TEST'] = np.where( (df3['IN'] == 1) & (df3['200D_MA'] == 1),1,\
                       np.where( (df3['TEST'].shift(-1) == 1)\
                       & (df3['200D_MA'] == 1),1,0)) 

but it throws KeyError: 'IN', presumably because I am using a condition on a column that has not been created yet. Can anyone help me figure out how to do this?
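For reference, the rule in the table is a row-by-row recurrence: each TEST value depends on the TEST value of the row above it, so a single vectorized np.where cannot read TEST values that have not been computed yet (also note that shift(-1) looks at the next row, not the previous one). A minimal sketch of the intended logic as a plain loop, assuming the column names from the table and that TEST counts as 0 before the first row:

```python
import pandas as pd

# Sample data copied from the table above
df3 = pd.DataFrame({
    "DATE": ["10/30/2013", "10/31/2013", "11/1/2013", "11/4/2013", "11/5/2013",
             "11/6/2013", "11/7/2013", "11/8/2013", "11/11/2013"],
    "IN": [0, 0, 1, 0, 0, 0, 0, 0, 0],
    "200D_MA": [1, 1, 1, 1, 1, 1, 1, 1, 0],
})

test = []
prev = 0  # assumption: TEST is treated as 0 before the first row
for in_flag, ma_flag in zip(df3["IN"], df3["200D_MA"]):
    # TEST is 1 only while 200D_MA is 1 and either IN is 1 now
    # or TEST was already 1 on the previous row
    prev = int(ma_flag == 1 and (in_flag == 1 or prev == 1))
    test.append(prev)

df3["TEST"] = test
print(df3)
```

This is slower than a vectorized version on large frames, but it states the rule directly and is handy as a baseline to check faster solutions against.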

2 Answers

Seems like you need a conditional ffill:

df['TEST']=df.loc[df.IN==1,'IN']
df.loc[df['200D_MA']==1,'TEST']=df.loc[df['200D_MA']==1,'TEST'].ffill()
df.fillna(0,inplace=True)
df.TEST=df.TEST.astype(int)
df
Out[349]: 
         DATE  IN  200D_MA  TEST
0  10/30/2013   0        1     0
1  10/31/2013   0        1     0
2   11/1/2013   1        1     1
3   11/4/2013   0        1     1
4   11/5/2013   0        1     1
5   11/6/2013   0        1     1
6   11/7/2013   0        1     1
7   11/8/2013   0        1     1
8  11/11/2013   0        0     0

I think you can use rolling to compute the previous TEST row.

df['TEST'] = (df['IN 200D_MA'] & df['IN 200D_MA'].rolling(2).min().shift(1)).astype(int)

Output:

            DATE  IN 200D_MA  TEST
10/30/2013     0           1     0
10/31/2013     0           1     0
11/1/2013      1           1     1
11/4/2013      0           1     1
11/5/2013      0           1     1
11/6/2013      0           1     1
11/7/2013      0           1     1
11/8/2013      0           1     1
11/11/2013     0           0     0

3 Comments

They both seem to work. I'm not sure how to give proper credit or which one to mark as correct. Thank you both.
First answer, fastest/most efficient solution, the one you understand best, or just plain flip a coin. :) Nah, credit Wen.
Actually, with more testing there are certain conditions where the solutions don't always work. I don't know how to add code in a comment, but @Wen, with your code, if the 200D_MA column goes to 0 and then back to 1, your solution will show a 1 in the TEST column, when it should only show a 1 if the IN column had a prior 1.
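One possible way to handle the case described in the last comment is to restart the "latch" every time 200D_MA changes value. The sketch below is not from either answer. It assumes the column names from the question, uses made-up sample data in which 200D_MA drops to 0 and then returns to 1, and introduces the helper names ma_on, block and latched purely for illustration:

```python
import pandas as pd

# Hypothetical data: 200D_MA goes 1 -> 0 -> 1, and IN only fires again later
df = pd.DataFrame({
    "IN":      [0, 1, 0, 0, 0, 1, 0],
    "200D_MA": [1, 1, 1, 0, 1, 1, 1],
})

ma_on = df["200D_MA"].eq(1)

# Start a new block id every time 200D_MA changes value
block = ma_on.ne(ma_on.shift()).cumsum()

# Within each block, cummax stays at 1 once IN has been 1
latched = df["IN"].eq(1).astype(int).groupby(block).cummax()

# TEST is 1 only where the latch is set and 200D_MA is currently 1
df["TEST"] = (latched.eq(1) & ma_on).astype(int)
print(df)
```

Here the row right after 200D_MA returns to 1 stays at 0 until IN is 1 again, which is the behavior the comment asks for. It is worth checking against a plain row-by-row loop on real data before relying on it.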
