Pandas Dataframe Transformations

Question

Consider a data frame which looks like:

             A        B        C
0   2018-10-13      100       50
1   2018-10-13      200       25
2   2018-10-13      300       10
3   2018-10-13      400        5
4   2018-10-13      500        0
5   2018-10-14      100      100
6   2018-10-14      200       50
7   2018-10-14      300       25
8   2018-10-14      400       10
9   2018-10-14      500        5
10  2018-10-15      100      150
11  2018-10-15      200      100
12  2018-10-15      300       50
13  2018-10-15      400       25
14  2018-10-15      500       10
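
For anyone who wants to reproduce this, the frame above can be rebuilt roughly as follows. This is only a sketch: it assumes column A holds plain date strings rather than datetimes, which the question does not specify, and the helper names `dates`, `b_values` and `c_values` are illustrative.

```python
import pandas as pd

# Values copied from the table above; A is assumed to be plain strings.
dates = ["2018-10-13", "2018-10-14", "2018-10-15"]
b_values = [100, 200, 300, 400, 500]
c_values = [
    [50, 25, 10, 5, 0],
    [100, 50, 25, 10, 5],
    [150, 100, 50, 25, 10],
]

df = pd.DataFrame({
    "A": [d for d in dates for _ in b_values],   # each date repeated 5 times
    "B": b_values * len(dates),                  # 100..500 repeated per date
    "C": [c for row in c_values for c in row],   # flatten row by row
})
```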

The transformation I want to perform is:

Group by column A.
Then bin column B into 3 intervals: [0, 100] (say intval-1), [101, 200] (say intval-2), and [201, end] (say intval-3). This should generalize to n intervals.
Sum column C within each group.

So my transformed/pivoted data frame should look like:

             A  intval-1  intval-2  intval-3
0   2018-10-13        50        25        15
1   2018-10-14       100        50        40
2   2018-10-15       150       100        85

An easy way to implement this would be a great help.

Thank You.

pivot won't work because you can't supply an aggregation function.
– jpp, Commented Oct 30, 2018 at 15:59

jpp · Accepted Answer · 2018-10-30 16:04:49Z

3

You can cut, then pivot_table:

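import numpy as np
import pandas as pd

# Bin edges; np.inf leaves the last interval open-ended
bin_lst = [0, 100, 200, np.inf]

# Label the intervals intval-1 .. intval-n, derived from the number of edges
cut_b = pd.cut(df['B'], bins=bin_lst,
               labels=[f'intval-{i}' for i in range(1, len(bin_lst))])

# Replace B with its interval label, then sum C per (A, interval)
res = df.assign(B=cut_b)\
        .pivot_table(index='A', columns='B', values='C', aggfunc='sum')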

print(res)

B           intval-1  intval-2  intval-3
A                                       
2018-10-13        50        25        15
2018-10-14       100        50        40
2018-10-15       150       100        85
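
To generalize to n intervals, the same cut-then-pivot idea could be wrapped in a small helper. The sketch below is not part of the original answer: the function name `sum_by_intervals` and its parameters are hypothetical. It also passes `include_lowest=True` so that a B value of exactly 0 falls into the first interval, as the question's [0, 100] suggests, and `observed=False` so that the behaviour for categorical columns is stated explicitly rather than left to the pandas default.

```python
import numpy as np
import pandas as pd

def sum_by_intervals(df, edges, group_col="A", bin_col="B", value_col="C"):
    """Sum value_col per group_col and per interval of bin_col (hypothetical helper)."""
    bins = [*edges, np.inf]                      # last interval is open-ended
    labels = [f"intval-{i}" for i in range(1, len(bins))]
    binned = pd.cut(df[bin_col], bins=bins, labels=labels, include_lowest=True)
    return (
        df.assign(**{bin_col: binned})           # works on a copy; df is not modified
          .pivot_table(index=group_col, columns=bin_col, values=value_col,
                       aggfunc="sum", observed=False)
    )

# Sample data copied from the question
df = pd.DataFrame({
    "A": np.repeat(["2018-10-13", "2018-10-14", "2018-10-15"], 5),
    "B": np.tile([100, 200, 300, 400, 500], 3),
    "C": [50, 25, 10, 5, 0, 100, 50, 25, 10, 5, 150, 100, 50, 25, 10],
})

res = sum_by_intervals(df, edges=[0, 100, 200])
```

Adding more edges, for example `edges=[0, 100, 200, 300]`, yields one more `intval-` column without touching the labels by hand.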

edited Oct 30, 2018 at 16:04

answered Oct 30, 2018 at 15:56


1 Comment

jpp Over a year ago

@Wen, Yep, I thought I'd go the "extra step" to type out labels :). But I think it's better to derive them from the length.

BENY · Accepted Answer · 2018-10-30 15:57:16Z

3

Using pd.cut with groupby + unstack

# Note: this overwrites column B in df with its interval labels
df['B'] = pd.cut(df['B'], bins=[0, 100, 200, np.inf], labels=['intval-1', 'intval-2', 'intval-3'])
df.groupby(['A', 'B'])['C'].sum().unstack()
Out[35]: 
B           intval-1  intval-2  intval-3
A                                       
2018-10-13        50        25        15
2018-10-14       100        50        40
2018-10-15       150       100        85
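
If the original numeric B column needs to stay intact, one possible variant is to pass the binned Series straight to groupby instead of assigning it back. This is a sketch rather than the answer's own code: the name `b_binned` is illustrative, and `observed=False` is spelled out so the result does not depend on the pandas default for categorical groupers.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": np.repeat(["2018-10-13", "2018-10-14", "2018-10-15"], 5),
    "B": np.tile([100, 200, 300, 400, 500], 3),
    "C": [50, 25, 10, 5, 0, 100, 50, 25, 10, 5, 150, 100, 50, 25, 10],
})

# Bin B into a separate Series; df["B"] itself is left numeric
b_binned = pd.cut(df["B"], bins=[0, 100, 200, np.inf],
                  labels=["intval-1", "intval-2", "intval-3"])

# Group by column A and the binned Series, sum C, then spread intervals into columns
res = df.groupby(["A", b_binned], observed=False)["C"].sum().unstack()
```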

answered Oct 30, 2018 at 15:57
