I have an Excel data file with thousands of rows and columns. I am using Python and have started using pandas dataframes to analyze the data. In column D, I want to calculate the annual change in the values in column C for each year and each ID. In Excel I can do this with a formula: if the org ID is the same as that in the prior row, calculate the annual change (leaving the cells highlighted in blue empty, because that is the first period for that particular ID). I don't know how to do this using Python. Can anyone help?

[Screenshot of the Excel sheet I am working on]

1 Answer

Assuming the dataframe is already sorted by ID and then by year:

df.groupby('ID').Cash.pct_change()

However, if the data is sorted you can speed things up, because it is not necessary to group in order to calculate the percentage change from one row to the next. Compute the change over the whole column, then mask the first row of each ID:

df.Cash.pct_change().mask(
    df.ID != df.ID.shift()
)
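
To check that the two expressions above agree, here is a minimal sketch on made-up data. The column names ID, Year and Cash and all the values are assumptions for illustration, not the asker's actual sheet.

```python
import pandas as pd

# Hypothetical sample data, already sorted by ID and then by Year.
df = pd.DataFrame({
    "ID": ["A", "A", "A", "B", "B"],
    "Year": [2015, 2016, 2017, 2015, 2016],
    "Cash": [100.0, 110.0, 99.0, 200.0, 250.0],
})

# Option 1: percentage change computed within each ID group.
grouped = df.groupby("ID")["Cash"].pct_change()

# Option 2: percentage change over the whole column, then blank out
# every row whose ID differs from the ID in the row above it.
masked = df["Cash"].pct_change().mask(df["ID"] != df["ID"].shift())

print(pd.DataFrame({"grouped": grouped, "masked": masked}))
```

In both results the first row of each ID is NaN, which corresponds to the cells highlighted in blue in the question.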

Both should produce the column values you are looking for. To add the column, assign the result to a new column of the dataframe (or create a new dataframe that includes it):

df['AnnChange'] = df.groupby('ID').Cash.pct_change()
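
If the rows are not sorted yet, one way to get there is to sort before assigning. This is only a sketch: the Year column and the sample values are assumptions, since the real column names are only visible in the screenshot.

```python
import pandas as pd

# Hypothetical unsorted data; ID, Year and Cash are assumed column names.
df = pd.DataFrame({
    "ID": ["B", "A", "B", "A", "A"],
    "Year": [2016, 2017, 2015, 2015, 2016],
    "Cash": [250.0, 99.0, 200.0, 100.0, 110.0],
})

# Sort so that each ID's rows are contiguous and in chronological order.
df = df.sort_values(["ID", "Year"]).reset_index(drop=True)

# Vectorized assignment: no explicit loop over the rows is needed.
df["AnnChange"] = df.groupby("ID")["Cash"].pct_change()

print(df)
```

The first year of each ID ends up as NaN in AnnChange, because there is no prior row within that group to compare against.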

3 Comments

Thanks! Would this skip calculating the % change for the highlighted rows (the first year for each ID)? Also, do I still have to loop through the data frame? Sorry for the basic questions; this is my second day working with data frames.
Why don’t you try it and see 🙂
Thanks! Both the options you proposed worked. I really appreciate your quick response. Have a good evening.
