Using pandas to make a pivot_table, but an error occurs

Question

I have a DataFrame whose head looks like this, and I want to make a pivot_table from it.

    user_id     item_id cate_id action_type action_date
0   11482147    492681  1_11    view          15
1   12070750    457406  1_14    deep_view     15
2   12431632    527476  1_1     view          15
3   13397746    531771  1_6     deep_view     15
4   13794253    510089  1_27    deep_view     15

There are 20000+ distinct user_id values, 37 cate_id values, and 5 action_type values. I want to make a pivot_table like one I built in Excel. The values in the table should be the value counts for every user_id with every cate_id. I've tried the following code:

user_cate_table = pd.pivot_table(user_cate_table2,index = ['user_id','cate_id'],columns=np.unique(train['action_type']),values='action_type',aggfunc=np.count_nonzero,fill_value=0)

And I got this message.

ValueError: Grouper and axis must be same length

Here is the head of the DataFrame user_cate_table2:

    user_id     item_id cate_id action_type
0   11482147    492681  1_11    1.0
1   12070750    457406  1_14    2.0
2   12431632    527476  1_1     1.0
3   13397746    531771  1_6     2.0
4   13794253    510089  1_27    2.0
5   14378544    535335  1_6     2.0
6   1705634     535202  1_10    1.0
7   6943823     478183  1_3     2.0
8   5902475     524378  1_6     1.0

What does np.unique(train['action_type']) mean? Are those the unique values of a column from another DataFrame?
– jezrael, Commented May 31, 2017 at 12:14
Yes, the original DataFrame is named train. That's what I showed at the top.
– Chunk_Ning, Commented May 31, 2017 at 12:27
So there are two DataFrames and you need a pivot_table built from them? Can you add a sample of the second DataFrame with the desired output? Thank you.
– jezrael, Commented May 31, 2017 at 12:30
OK. Actually, the DataFrame user_cate_table2 is derived from the DataFrame train; I just mapped each action_type to a different number.
– Chunk_Ning, Commented May 31, 2017 at 12:32

jezrael · Accepted Answer · 2017-05-31 13:17:23Z

3

I think you need groupby + size + unstack:

df1 = df.groupby(['user_id','cate_id', 'action_type']).size().unstack(fill_value=0)
print (df1)
action_type       deep_view  view
user_id  cate_id                 
11482147 1_11             0     1
12070750 1_14             1     0
12431632 1_1              0     1
13397746 1_6              1     0
13794253 1_27             1     0

Another solution with pivot_table:

df1 = df.pivot_table(index=['user_id','cate_id'], 
                     columns='action_type', 
                     values='item_id', 
                     aggfunc=len, 
                     fill_value=0)
print (df1)
action_type       deep_view  view
user_id  cate_id                 
11482147 1_11             0     1
12070750 1_14             1     0
12431632 1_1              0     1
13397746 1_6              1     0
13794253 1_27             1     0
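
For anyone who wants to run the two approaches above end to end, here is a minimal self-contained sketch. It assumes `df` holds the five sample rows from the head shown in the question; the names `by_groupby`, `by_pivot` and `same_counts` are only illustrative and do not come from the answer.

```python
import pandas as pd

# Sample rows copied from the head shown in the question.
df = pd.DataFrame({
    "user_id": [11482147, 12070750, 12431632, 13397746, 13794253],
    "item_id": [492681, 457406, 527476, 531771, 510089],
    "cate_id": ["1_11", "1_14", "1_1", "1_6", "1_27"],
    "action_type": ["view", "deep_view", "view", "deep_view", "deep_view"],
    "action_date": [15, 15, 15, 15, 15],
})

# Approach 1: count rows per (user_id, cate_id, action_type),
# then move action_type (the last group level) into the columns.
by_groupby = (
    df.groupby(["user_id", "cate_id", "action_type"])
      .size()
      .unstack(fill_value=0)
)

# Approach 2: pivot_table that counts rows with len.
by_pivot = df.pivot_table(
    index=["user_id", "cate_id"],
    columns="action_type",
    values="item_id",
    aggfunc=len,
    fill_value=0,
)

# Compare the raw counts produced by the two approaches.
same_counts = bool((by_groupby.to_numpy() == by_pivot.to_numpy()).all())

print(by_groupby)
print("Same counts:", same_counts)
```

On this sample both tables should hold the same counts, so either form can be used; the groupby version avoids having to pick an arbitrary `values` column.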

edited May 31, 2017 at 13:17

answered May 31, 2017 at 13:11


4 Comments

Maarten Fabré Over a year ago

Yet again I arrived at the same answer as @jezrael, who posted while I was writing mine.

jezrael Over a year ago

@MaartenFabré - Thank you. I checked your answer and it is a bit different, but I have no idea whether the OP wants something like my solution or yours.

Maarten Fabré Over a year ago

I think .size uses np.count_nonzero, so that comes down to the same thing, and I didn't use fill_value. The rest is comparable.

jezrael Over a year ago

@Chunk_Ning - Thank you for patience ;)

Maarten Fabré · Accepted Answer · 2017-05-31 13:18:03Z

0

You don't need to use pivot_table. You can use groupby and unstack:

df.groupby(['user_id', 'cate_id', 'action_type'])['action_date'].agg(np.count_nonzero).unstack('action_type')
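
One caveat worth checking before relying on the count_nonzero aggregation above: it counts non-zero values of action_date, not rows, so it only matches a plain row count when no action_date is 0. The sketch below uses hypothetical data (one row with action_date == 0, which the question's sample does not contain) to show the difference; `row_counts` and `nonzero_dates` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row has action_date == 0 to expose the difference.
df = pd.DataFrame({
    "user_id": [1, 1, 2],
    "cate_id": ["1_1", "1_1", "1_6"],
    "action_type": ["view", "view", "deep_view"],
    "action_date": [0, 15, 15],
})

grouped = df.groupby(["user_id", "cate_id", "action_type"])

# Number of rows in each group, regardless of the values they hold.
row_counts = grouped.size()

# Number of non-zero action_date values in each group.
nonzero_dates = grouped["action_date"].agg(np.count_nonzero)

print(row_counts)
print(nonzero_dates)
```

If action_date can never be 0 in the real data, the two results agree; otherwise `.size()` is the safer way to count rows.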

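pivot_table works too, but you misunderstood the columns= parameter: it expects the name of the column to spread across the table, not an array of that column's unique values. An array passed there is treated as a grouping key, so it has to be as long as the DataFrame.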

pd.pivot_table(df,index = ['user_id','cate_id'],columns=['action_type'],aggfunc=np.count_nonzero,fill_value=0)
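
To illustrate the point about columns=, here is a minimal sketch that reproduces the failure on the question's sample rows and then applies the fix. It assumes a reasonably recent pandas, uses item_id with aggfunc=len as the counted column (a choice made for this sketch, not part of this answer), and `unique_actions`, `raised` and `fixed` are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the head shown in the question.
df = pd.DataFrame({
    "user_id": [11482147, 12070750, 12431632, 13397746, 13794253],
    "item_id": [492681, 457406, 527476, 531771, 510089],
    "cate_id": ["1_11", "1_14", "1_1", "1_6", "1_27"],
    "action_type": ["view", "deep_view", "view", "deep_view", "deep_view"],
})

# Array with one entry per distinct action type (2 here), not one per row (5).
unique_actions = np.unique(df["action_type"])

# Passing the array as columns= makes pandas try to group by it,
# which fails because its length differs from the number of rows.
try:
    pd.pivot_table(
        df,
        index=["user_id", "cate_id"],
        columns=unique_actions,
        values="item_id",
        aggfunc=len,
        fill_value=0,
    )
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Passing the column name lets pandas derive the distinct values itself.
fixed = pd.pivot_table(
    df,
    index=["user_id", "cate_id"],
    columns="action_type",
    values="item_id",
    aggfunc=len,
    fill_value=0,
)
print(fixed)
```

The exact wording of the error message may vary between pandas versions, so the sketch only relies on the exception type.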

answered May 31, 2017 at 13:18

