How to transform a grouped DataFrame in Python

Question

I have a DataFrame like this:

bin=[0,5,10]

code sex age
a     1   1
a     1   6
b     1   8
b     2   2
c     2   3
c     1   4

I summarized this DataFrame like this:

df.groupby([df.code, df.sex, pd.cut(df.age, bin)]).size().unstack().stack().fillna(0)

and got the result below:

code sex    age
a    1  (0,5] 1
a    1 (5,10] 1
a    2  (0,5] 0
a    2 (5,10] 0
b    1  (0,5] 0
b    1 (5,10] 1
b    2  (0,5] 1
b    2 (5,10] 0
c    1  (0,5] 1
c    1 (5,10] 0
c    2  (0,5] 1
c    2 (5,10] 0

I would like to transform this DataFrame into the following:

        1      2
        a b c  a b c
 (0,5]  1 0 1  0 1 1
(5,10]  1 1 0  0 0 0

I tried stack() and unstack(), but I am completely confused about how to get to the DataFrame above. How can I transform it? Could someone explain how to do this?

I can't reproduce your intermediate result based on your code.
– James, Commented Oct 26, 2017 at 3:23

Andy Hayden · Accepted Answer · 2017-10-26 03:38:25Z

2

You can do this with a single pivot_table:

In [11]: df
Out[11]:
  code  sex  age
0    a    1    1
1    a    1    6
2    b    1    8
3    b    2    2
4    c    2    3
5    c    1    4

In [12]: df.pivot_table(index=pd.cut(df.age, bins),
                        columns=["sex", "code"],
                        aggfunc="count",
                        fill_value=0)
Out[12]:
        age
sex       1        2
code      a  b  c  a  b  c
age
(0, 5]    1  0  1  0  1  1
(5, 10]   1  1  0  0  0  0
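For anyone who wants to run this end to end, here is a minimal sketch of the same pivot_table idea. It assumes the sample data from the question and defines `bins` itself. The names `age_bin`, `all_columns` and `table` are made up for illustration, and the final reindex is an extra step (not part of the answer above) that guarantees a column for every (sex, code) pair regardless of how a given pandas version treats empty combinations.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "code": ["a", "a", "b", "b", "c", "c"],
    "sex": [1, 1, 1, 2, 2, 1],
    "age": [1, 6, 8, 2, 3, 4],
})
bins = [0, 5, 10]

# Give the binned Series its own name (an assumption made here) so the
# grouping key and the counted "age" column stay clearly distinct.
age_bin = pd.cut(df["age"], bins).rename("age_bin")

table = df.pivot_table(
    values="age",            # count the non-null ages in each cell
    index=age_bin,
    columns=["sex", "code"],
    aggfunc="count",
    fill_value=0,
    observed=False,
)

# Extra step: make sure every (sex, code) pair has a column, even pairs
# with no rows at all such as (2, "a").
all_columns = pd.MultiIndex.from_product(
    [[1, 2], ["a", "b", "c"]], names=["sex", "code"]
)
table = table.reindex(columns=all_columns, fill_value=0)
print(table)
```

Passing `values="age"` explicitly also drops the extra top-level "age" label that appears in the output above.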

answered Oct 26, 2017 at 3:38

1 Comment

Andy Hayden Over a year ago

Note: any time you see stack/unstack think pivot_table!

BENY · Accepted Answer · 2017-10-26 03:51:57Z

2

df.reset_index().set_index(['sex','code','age']).unstack(-1).T
Out[760]: 
sex           1        2      
code          a  b  c  a  b  c
      age                     
value (0,5]   1  0  1  0  1  1
      (5,10]  1  1  0  0  0  0

Input data (df above is the summarized result from the question, with the counts in a column named value):

Out[762]: 
                 value
code sex age          
a    1   (0,5]       1
         (5,10]      1
     2   (0,5]       0
         (5,10]      0
b    1   (0,5]       0
         (5,10]      1
     2   (0,5]       1
         (5,10]      0
c    1   (0,5]       1
         (5,10]      0
     2   (0,5]       1
         (5,10]      0
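The chain above starts from the already summarized frame rather than the raw data. A minimal sketch that rebuilds that input by hand and applies the same chain might look like this. The names `summary` and `wide` are hypothetical, and the age bins are written as plain strings (an assumption) so the example does not depend on pd.cut.

```python
import pandas as pd

# Rebuild the summarized counts shown under "Input data" by hand.
# The age bins are plain strings here, which is an assumption made to
# keep the example independent of pd.cut.
index = pd.MultiIndex.from_product(
    [["a", "b", "c"], [1, 2], ["(0,5]", "(5,10]"]],
    names=["code", "sex", "age"],
)
summary = pd.DataFrame(
    {"value": [1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0]}, index=index
)

# Reorder the index levels, move the last level (age) into the columns,
# then transpose so that age ends up on the rows.
wide = (
    summary.reset_index()
    .set_index(["sex", "code", "age"])
    .unstack(-1)
    .T
)
print(wide)
```

Because `value` is the only column, the rows of `wide` carry a two-level index of ("value", age bin), as in the output shown in the answer.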

Or use crosstab:

pd.crosstab(index=pd.cut(df.age, bin),
            columns=[df.sex, df.code])
Out[768]: 
sex      1        2   
code     a  b  c  b  c
age                   
(0, 5]   1  0  1  1  1
(5, 10]  1  1  0  0  0
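Note that the crosstab output has no column for sex 2 / code a, because that pair never occurs in the data, while the layout asked for in the question includes it. One possible way to get it back, sketched here under the assumption that the full set of sex and code values can be read from the data itself, is to reindex the columns against the full product. The names `counts`, `full_columns` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "code": ["a", "a", "b", "b", "c", "c"],
    "sex": [1, 1, 1, 2, 2, 1],
    "age": [1, 6, 8, 2, 3, 4],
})
bins = [0, 5, 10]

counts = pd.crosstab(
    index=pd.cut(df["age"], bins),
    columns=[df["sex"], df["code"]],
)

# Build every (sex, code) pair and reindex, filling pairs that never
# occur in the data with 0.
full_columns = pd.MultiIndex.from_product(
    [sorted(df["sex"].unique()), sorted(df["code"].unique())],
    names=["sex", "code"],
)
result = counts.reindex(columns=full_columns, fill_value=0)
print(result)
```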

edited Oct 26, 2017 at 3:51

answered Oct 26, 2017 at 3:28

3 Comments

Andy Hayden Over a year ago

It's all about the pivot_table (if you see stack/unstack, think pivot_table)! This doesn't handle the case where the value is > 1 ... not sure if that's possible in the OP's data. Edit: I take it back, that's handled by the .size() in the OP's code!

BENY Over a year ago

@AndyHayden Added the crosstab method.

Andy Hayden Over a year ago

:D haha, great!

Ken Wei · Accepted Answer · 2017-10-26 03:25:38Z

1

On the dataframe you have given, do

df.set_index(['code','sex']).unstack(['code','sex'])

In the future, please give your data in a form that others can run themselves, e.g. the output from df.to_records() or df.to_json().
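As a rough illustration of that advice, the sketch below serializes the question's sample data with df.to_json() and reads it back the way someone answering might. The variable names `payload` and `restored` are made up for this example, and `orient="records"` is just one reasonable choice.

```python
import io

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "code": ["a", "a", "b", "b", "c", "c"],
    "sex": [1, 1, 1, 2, 2, 1],
    "age": [1, 6, 8, 2, 3, 4],
})

# A JSON string that can be pasted straight into a question.
payload = df.to_json(orient="records")
print(payload)

# Anyone answering can rebuild the same frame from that string.
restored = pd.read_json(io.StringIO(payload), orient="records")
print(restored)
```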

answered Oct 26, 2017 at 3:25

jezrael · Accepted Answer · 2017-10-26 05:13:31Z

1

You are close; you only need to specify the level parameter in unstack and then sort the columns:

# The parentheses let the method chain span several lines.
df = (df.groupby([df.code, df.sex, pd.cut(df.age, bin)])
        .size()
        .unstack(level=[1, 0])
        .sort_index(axis=1)
        .fillna(0))
print(df)
sex        1              2     
code       a    b    c    b    c
age                             
(0, 5]   1.0  0.0  1.0  1.0  1.0
(5, 10]  1.0  1.0  0.0  0.0  0.0
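The counts above come out as floats because unstack first fills the missing cells with NaN and fillna(0) runs afterwards. A small variation, sketched here on the question's sample data, passes `fill_value=0` to unstack so the counts stay integers. Using level names instead of positions and passing `observed=True` are choices made for this example, not part of the answer above, and `counts` is a hypothetical name.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "code": ["a", "a", "b", "b", "c", "c"],
    "sex": [1, 1, 1, 2, 2, 1],
    "age": [1, 6, 8, 2, 3, 4],
})
bins = [0, 5, 10]

counts = (
    # observed=True keeps only the group combinations present in the data.
    df.groupby([df["code"], df["sex"], pd.cut(df["age"], bins)], observed=True)
    .size()
    # Move sex and code into the columns; fill gaps with 0 right away so
    # the dtype stays integer instead of becoming float.
    .unstack(level=["sex", "code"], fill_value=0)
    .sort_index(axis=1)
)
print(counts)
```

As in the output above, only the (sex, code) pairs that occur in the data get a column.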

answered Oct 26, 2017 at 5:13
