I have a pandas dataframe, which the following command works on:

house.groupby(['place_name'])['index_nsa'].agg(['first','last'])

It gives me what I want. Now I want to make a custom aggregation value that gives me the percentage change between the first and the last value.

I got an error for doing math on the values, so I assumed that I have to turn them into numbers.

house.groupby(['place_name'])['index_nsa'].agg({"change in %":[(int('last')-int('first')/int('first')]})

Unfortunately, I only get a syntax error at the last bracket, and I cannot find the cause.

Does anyone see where I went wrong?

1 Answer

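The syntax error comes from the parenthesis opened before int('last'), which is never closed. Even with that fixed, int('last') would fail, because 'first' and 'last' are only the names of built-in aggregations, not values you can do math on. You need to define and pass a callback to agg instead. You can do that inline with a lambda function: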

house.groupby(['place_name'])['index_nsa'].agg([
    ("change in %", lambda x: (x.iloc[-1] - x.iloc[0]) / x.iloc[0])])

Look closely at the .agg call: to rename the output column, pass a list of tuples of the form [(new_name, agg_func), ...].
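To see the list-of-tuples form in action, here is a minimal, self-contained sketch. The data is made up and only stands in for the real `house` DataFrame from the question; the column names are taken from the question.

```python
import pandas as pd

# Hypothetical toy data standing in for the `house` DataFrame in the question.
house = pd.DataFrame({
    "place_name": ["A", "A", "A", "B", "B", "B"],
    "index_nsa": [100.0, 110.0, 150.0, 200.0, 180.0, 100.0],
})

# Each tuple is (output column name, aggregation function).
# The lambda receives one group's values as a Series.
result = house.groupby("place_name")["index_nsa"].agg([
    ("change in %", lambda x: (x.iloc[-1] - x.iloc[0]) / x.iloc[0]),
])

print(result)
# A: (150 - 100) / 100 = 0.5, B: (100 - 200) / 200 = -0.5
```

Note that the result is a fraction, so 0.5 means a 50% increase. Multiply by 100 inside the lambda if you want percentage points.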

If you want to avoid the lambda at the cost of some verbosity, you can use a named function:

def first_last_pct(ser):
    first, last = ser.iloc[0], ser.iloc[-1]
    return (last - first) / first

house.groupby(['place_name'])['index_nsa'].agg([("change in %", first_last_pct)])
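Another option, sketched here with made-up data rather than the real `house` DataFrame, is to keep the `agg(['first','last'])` call from the question and do the arithmetic on the resulting columns. This avoids calling a Python function per group. One assumption to check against your data: the built-in `'first'` and `'last'` skip missing values by default, while `iloc[0]` and `iloc[-1]` do not, so the two approaches can differ when a group starts or ends with NaN.

```python
import pandas as pd

# Hypothetical toy data standing in for the `house` DataFrame in the question.
house = pd.DataFrame({
    "place_name": ["A", "A", "A", "B", "B", "B"],
    "index_nsa": [100.0, 110.0, 150.0, 200.0, 180.0, 100.0],
})

# Step 1: the aggregation that already works in the question.
summary = house.groupby("place_name")["index_nsa"].agg(["first", "last"])

# Step 2: ordinary column arithmetic; multiply by 100 for percentage points.
summary["change in %"] = (summary["last"] - summary["first"]) / summary["first"] * 100

print(summary)
# A: 50.0, B: -50.0
```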