
I have a DataFrame "last_3_days" like this:

[image: sample of the last_3_days data]

Then I group by ['user_id','sku_id','type'] and count the occurrences of each 'type' for each ['user_id','sku_id'] pair.

How can I assign the groupby results back to each ['user_id','sku_id'] pair? Each pair should get these additional columns: ['type1_count','type2_count',...,'type6_count'].
Each column holds the count of that 'type'. There are 6 distinct values in the 'type' column.

Update: @gereleth's answer does what I want, but the result looks like this:

[image: result produced by the answer's code]

How can I change the above into this?

[image: desired result]

  • It's considered bad form to post pictures of data. To make it easier to help, we like to be able to reproduce a working data set, and I can't copy and paste from your image. That said, we couldn't care less if you post an image as long as you also include code or text to produce the data. Commented Mar 26, 2017 at 23:14

1 Answer

I think you want to use unstack on the groupby results.

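df = (last_3_days.groupby(['user_id', 'sku_id', 'type'])  # keys must be passed as one list
        .size()                   # number of rows per (user_id, sku_id, type)
        .unstack(fill_value=0)    # move 'type' from the index into columns
        .add_prefix('type').add_suffix('_count'))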

unstack turns the last level of the index (here, type) into columns. fill_value is the value used for combinations that do not occur in the data. The new column names are the unique values of type, so the last line renames them into the format you want.
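To make the reshaping concrete, here is a minimal sketch on made-up data. The sample rows are an assumption (the real last_3_days was only shown as an image), and the type values are assumed to be integers.

```python
import pandas as pd

# Hypothetical sample data; the real last_3_days was only shown as an image.
last_3_days = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2],
    'sku_id':  [10, 10, 10, 20, 20],
    'type':    [1, 1, 6, 2, 2],
})

counts = (last_3_days.groupby(['user_id', 'sku_id', 'type'])
          .size()                   # Series indexed by (user_id, sku_id, type)
          .unstack(fill_value=0)    # one column per type value seen in the data
          .add_prefix('type').add_suffix('_count'))

print(counts)
```

Only type values that actually occur become columns. If some of the six types may be missing and you still want all six columns, adding .reindex(columns=range(1, 7), fill_value=0) right after unstack should work, assuming the types are the integers 1 to 6.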


2 Comments

This is almost exactly what I want, but the result is laid out a little differently. I have added an update to the question description. Please take a look. Thanks a lot!
Just add a .reset_index() at the end.
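A small sketch of what that looks like, again on made-up data (the sample rows are an assumption). Clearing columns.name is an optional extra step, not part of the comment above; it removes the leftover "type" label that the column axis may still carry after unstack.

```python
import pandas as pd

# Hypothetical sample data standing in for last_3_days.
last_3_days = pd.DataFrame({
    'user_id': [1, 1, 2],
    'sku_id':  [10, 10, 20],
    'type':    [1, 2, 2],
})

flat = (last_3_days.groupby(['user_id', 'sku_id', 'type'])
        .size().unstack(fill_value=0)
        .add_prefix('type').add_suffix('_count')
        .reset_index())        # user_id and sku_id become ordinary columns
flat.columns.name = None       # optional: drop the leftover label on the column axis

print(flat)
```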
