1

I have a pandas dataframe:

Reference timestamp sub_reference datatype_indicator figure
REF1 2022-09-01 10 A 23.6
REF1 2022-09-01 48 B 25.8
REF1 2022-09-02 10 A 17.4
REF1 2022-10-01 10 A 23.6
REF1 2022-10-01 48 B 25.8
REF1 2022-10-02 10 A 17.4
REF2 2022-09-01 10 A 23.6
REF2 2022-09-01 48 B 25.8
REF2 2022-09-02 10 A 17.4
REF2 2022-10-01 11 A 23.6
REF2 2022-10-01 47 B 25.8
REF2 2022-10-02 10 A 17.4
REF3 2022-09-01 10 A 23.6
REF3 2022-09-01 48 B 25.8
REF3 2022-09-02 10 A 17.4
REF3 2022-10-01 11 A 23.6
REF3 2022-10-01 47 B 25.8
REF3 2022-10-02 10 A 17.4

I need to group the data by 'Reference' and the month in 'timestamp' to produce an aggregated value of 'figure' for the reference/month..

I am trying the below code, but receive TypeError: unhashable type: 'Series'

dg = df1.groupby([
            pd.Grouper('reference'),
            pd.Grouper(df1['timestamp'].dt.month)
            ]).sum()
dg.index = dg.index.strftime('%B')
print(dg)
1
  • did the answer worked for you? Commented Nov 17, 2022 at 18:56

3 Answers 3

1

I've never used the pd.Grouper before, but I think your issue is with how it is treating the extraction of the month.

I tried it like this:

>>> # add a new column for month
>>> df1["month"] = df1["timestamp"].dt.month

>>> dg = df1.groupby(by=["Reference", "month"], as_index=False).agg({"figure":sum})
>>> dg
  Reference  month  figure
0      REF1      9    66.8
1      REF1     10    66.8
2      REF2      9    66.8
3      REF2     10    66.8
4      REF3      9    66.8
5      REF3     10    66.8
Sign up to request clarification or add additional context in comments.

Comments

0
# create a year-month from teh date
# groupby and sum figure
df['month'] = pd.to_datetime(df['timestamp']).dt.strftime('%Y-%b')
out= df.groupby(['Reference','month' ], as_index=False)['figure'].sum()

out

OR

# use assign to create month column
# group and sum figure

out= (df.assign(month=pd.to_datetime(df['timestamp']).dt.strftime('%Y-%b'))
 .groupby(['Reference','month' ], as_index=False)['figure'].sum())

out
    Reference   month   figure
0   REF1    2022-Oct    66.8
1   REF1    2022-Sep    66.8
2   REF2    2022-Oct    66.8
3   REF2    2022-Sep    66.8
4   REF3    2022-Oct    66.8
5   REF3    2022-Sep    66.8

Comments

0
grouper = pd.PeriodIndex(df['timestamp'], freq='M')
df.groupby(['Reference', grouper])['figure'].sum().reset_index()

result:

    Reference   timestamp   figure
0   REF1        2022-09     66.8
1   REF1        2022-10     66.8
2   REF2        2022-09     66.8
3   REF2        2022-10     66.8
4   REF3        2022-09     66.8
5   REF3        2022-10     66.8

if you want change to %B

grouper = pd.to_datetime(df['timestamp']).dt.strftime('%B')
df.groupby(['Reference', grouper])['figure'].sum().reset_index()

result:

    Reference   timestamp   figure
0   REF1        October     66.8
1   REF1        September   66.8
2   REF2        October     66.8
3   REF2        September   66.8
4   REF3        October     66.8
5   REF3        September   66.8

2 Comments

I tried this and the output wasn't as expected (sorry can't work out how to format the table in this reply) timestamp 0 September 1 September 2 September 1440 October 1441 October
draw your desired output by example and edit question.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.