I have a dataframe that looks like this:

df= 

    Time                  x          y
0   2018-09-13 01:17:00  5.0        0.0
1   2018-09-13 02:17:00  9.0        0.0
2   2018-09-13 03:17:00  2.0        1.0
3   2018-09-13 04:17:00  1.0        0.0

.......

I want to iterate through the whole dataframe and calculate a new variable z. Each value of z is the previous value of z plus x minus y, with z starting at 0: z = z[prev] + x - y

For example, the final output would be:

    Time                     z
0   2018-09-13 01:17:00      5    #[0+5-0]
1   2018-09-13 02:17:00      14   #[5+9-0]
2   2018-09-13 03:17:00      15   #[14+2-1]
3   2018-09-13 04:17:00      16   #[15+1-0]

.....

I am finding it difficult to iterate over the time series data.

I have tried the following, but it does not work:

for i,row in df.iterrows():
    z=0
    row['z']=row['z']+row['x']-row['y']

print[z]

2 Answers

In your case, use cumsum: z is just the running total of x - y.

df['new'] = df.x.sub(df.y).cumsum()
df['new']
Out[410]: 
0     5.0
1    14.0
2    15.0
3    16.0
Name: new, dtype: float64
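
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the four rows shown in the question, and the result is stored in a column named z (rather than new) to match the question's wording; both choices are assumptions made for illustration.

```python
import pandas as pd

# Sample data copied from the four rows shown in the question.
df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2018-09-13 01:17:00",
        "2018-09-13 02:17:00",
        "2018-09-13 03:17:00",
        "2018-09-13 04:17:00",
    ]),
    "x": [5.0, 9.0, 2.0, 1.0],
    "y": [0.0, 0.0, 1.0, 0.0],
})

# z[i] = z[i-1] + x[i] - y[i], starting from 0, is the running sum of (x - y).
df["z"] = (df["x"] - df["y"]).cumsum()

print(df[["Time", "z"]])
```

Because the recurrence only ever adds x - y to the previous value, no explicit loop is needed, and the vectorised form should scale much better on a long time series.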

You can use indices

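z = []
for i in range(len(df)):
    # Change contributed by the current row (assumes the default 0..n-1 index)
    delta = df.loc[i, 'x'] - df.loc[i, 'y']
    if i == 0:
        # First row: there is no previous z, so start from 0
        z.append(delta)
    else:
        # Later rows: previous z plus the current change
        z.append(z[i-1] + delta)

df['z'] = z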
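
If you would rather keep the iterrows loop from the question, here is a sketch of how it could be repaired. The original attempt resets z to 0 on every iteration, reads row['z'] before any z column exists, assigns to the row object (which may be a copy, so the dataframe is not updated), and calls print with square brackets instead of parentheses. The sample data and the helper names running_total and values are assumptions made for illustration.

```python
import pandas as pd

# Sample x and y values copied from the question.
df = pd.DataFrame({
    "x": [5.0, 9.0, 2.0, 1.0],
    "y": [0.0, 0.0, 1.0, 0.0],
})

running_total = 0.0   # initialised once, before the loop
values = []           # collects z for each row

for _, row in df.iterrows():
    # Carry the previous total forward and add this row's change.
    running_total += row["x"] - row["y"]
    values.append(running_total)

# Assign the whole column at once instead of writing to the row object.
df["z"] = values

print(df["z"].tolist())
```

This keeps the loop structure easy to follow, but on large dataframes iterrows is slow, so the cumsum approach is usually the better choice.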

1 Comment

Awesome! This will help me to understand the concept of for loop!