
I have a dataframe whose head looks like this, and I want to make a pivot_table from it.

    user_id     item_id cate_id action_type action_date
0   11482147    492681  1_11    view          15
1   12070750    457406  1_14    deep_view     15
2   12431632    527476  1_1     view          15
3   13397746    531771  1_6     deep_view     15
4   13794253    510089  1_27    deep_view     15

There are 20000+ user_id values, 37 cate_id values, and 5 action_type values. I want to make a pivot_table like the one I built in Excel (image: pivot_table). The values in the table should be the value counts for every user_id with every cate_id. I've tried the following code:

user_cate_table = pd.pivot_table(user_cate_table2,index = ['user_id','cate_id'],columns=np.unique(train['action_type']),values='action_type',aggfunc=np.count_nonzero,fill_value=0)

And I got this message:

ValueError: Grouper and axis must be same length

The head of dataframe user_cate_table2:

    user_id     item_id cate_id action_type
0   11482147    492681  1_11    1.0
1   12070750    457406  1_14    2.0
2   12431632    527476  1_1     1.0
3   13397746    531771  1_6     2.0
4   13794253    510089  1_27    2.0
5   14378544    535335  1_6     2.0
6   1705634     535202  1_10    1.0
7   6943823     478183  1_3     2.0
8   5902475     524378  1_6     1.0
  • What does np.unique(train['action_type']) mean? Unique values of a column from another dataframe? Commented May 31, 2017 at 12:14
  • Yes, the original dataframe is named train. That's what I showed at the top. Commented May 31, 2017 at 12:27
  • So there are 2 dataframes and you need a pivot_table from them? Can you add a sample of the second dataframe with the desired output? Thank you. Commented May 31, 2017 at 12:30
  • OK. Actually the dataframe user_cate_table2 is transformed from the dataframe train. I just changed each action_type to a different number. Commented May 31, 2017 at 12:32
  • So can you add the desired output? Commented May 31, 2017 at 12:35

2 Answers 2


I think you need groupby + size + unstack:

df1 = df.groupby(['user_id','cate_id', 'action_type']).size().unstack(fill_value=0)
print (df1)
action_type       deep_view  view
user_id  cate_id                 
11482147 1_11             0     1
12070750 1_14             1     0
12431632 1_1              0     1
13397746 1_6              1     0
13794253 1_27             1     0

Another solution with pivot_table:

df1 = df.pivot_table(index=['user_id','cate_id'], 
                     columns='action_type', 
                     values='item_id', 
                     aggfunc=len, 
                     fill_value=0)
print (df1)
action_type       deep_view  view
user_id  cate_id                 
11482147 1_11             0     1
12070750 1_14             1     0
12431632 1_1              0     1
13397746 1_6              1     0
13794253 1_27             1     0
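
For anyone who wants to run both approaches end to end, here is a minimal sketch. It assumes `df` holds only the five sample rows from the question; the names `by_groupby`, `by_pivot`, and `same` are just illustrative.

```python
import pandas as pd

# Sample rows copied from the head of the dataframe in the question
df = pd.DataFrame({
    "user_id": [11482147, 12070750, 12431632, 13397746, 13794253],
    "item_id": [492681, 457406, 527476, 531771, 510089],
    "cate_id": ["1_11", "1_14", "1_1", "1_6", "1_27"],
    "action_type": ["view", "deep_view", "view", "deep_view", "deep_view"],
    "action_date": [15, 15, 15, 15, 15],
})

# Approach 1: count rows per (user_id, cate_id, action_type),
# then move action_type from the index into the columns
by_groupby = (
    df.groupby(["user_id", "cate_id", "action_type"])
      .size()
      .unstack(fill_value=0)
)

# Approach 2: the same counts through pivot_table
by_pivot = df.pivot_table(
    index=["user_id", "cate_id"],
    columns="action_type",
    values="item_id",
    aggfunc=len,
    fill_value=0,
)

# Compare the raw values of the two results
same = (by_groupby.to_numpy() == by_pivot.to_numpy()).all()
print(by_groupby)
print(same)
```

On the full data the result should have one row per (user_id, cate_id) pair that actually occurs and one column per action_type.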

4 Comments

Yet again I arrive at the same answer as @jezrael, who posted while I compiled mine.
@MaartenFabré - Thank you. I checked your answer and it is a bit different, but I have no idea if the OP wants something like my solution or yours.
I think .size uses np.count_nonzero, so that comes down to the same thing, and I didn't use fill_value. The rest is comparable.
@Chunk_Ning - Thank you for patience ;)

You don't need to use pivot_table. You can use groupby and unstack:

df.groupby(['user_id', 'cate_id', 'action_type'])['action_date'].agg(np.count_nonzero).unstack('action_type')

pivot_table works too, but you misunderstood the columns= parameter. It expects the name of the column to spread across the columns, not the unique values in it:

pd.pivot_table(df,index = ['user_id','cate_id'],columns=['action_type'],aggfunc=np.count_nonzero,fill_value=0)
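
To see the difference on a small example, here is a rough sketch. It assumes a single frame `df` built from the five sample rows in the question (the question used two frames, `train` and `user_cate_table2`), and the names `error_message` and `fixed` are only for illustration. As far as I can tell, an array passed as columns= is treated as a grouping key with one value per row, which is why its length has to match the frame.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (assumption: one frame instead of two)
df = pd.DataFrame({
    "user_id": [11482147, 12070750, 12431632, 13397746, 13794253],
    "item_id": [492681, 457406, 527476, 531771, 510089],
    "cate_id": ["1_11", "1_14", "1_1", "1_6", "1_27"],
    "action_type": ["view", "deep_view", "view", "deep_view", "deep_view"],
})

# Passing the unique values: an array of 2 items against a frame of 5 rows
error_message = None
try:
    pd.pivot_table(
        df,
        index=["user_id", "cate_id"],
        columns=np.unique(df["action_type"]),
        values="action_type",
        aggfunc=np.count_nonzero,
        fill_value=0,
    )
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Passing the column name: pandas works out the unique values itself
fixed = pd.pivot_table(
    df,
    index=["user_id", "cate_id"],
    columns="action_type",
    values="item_id",
    aggfunc="count",
    fill_value=0,
)
print(fixed)
```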
