I have a dataframe

   city  skills  priority  acknowledge  id_count  acknowledge_count
   ABC   XXX     High      Yes          11        2
   ABC   XXX     High      No           10        3
   ABC   XXX     Med       Yes          5         1
   ABC   YYY     Low       No           1         5
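
For anyone who wants to reproduce this, the sample above can be rebuilt roughly as follows. This is only a sketch: it assumes the two count columns are plain integers and the remaining columns are strings, which may not match the real data.

```python
import pandas as pd

# Rebuild the four sample rows from the table above.
# Assumption: counts are integers, everything else is a string.
df = pd.DataFrame({
    "city": ["ABC", "ABC", "ABC", "ABC"],
    "skills": ["XXX", "XXX", "XXX", "YYY"],
    "priority": ["High", "High", "Med", "Low"],
    "acknowledge": ["Yes", "No", "Yes", "No"],
    "id_count": [11, 10, 5, 1],
    "acknowledge_count": [2, 3, 1, 5],
})

print(df)
```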

I want to group by city and skills and get total_id_count as the sum of id_count, split into three separate columns by priority (High, Med, Low). Similarly, I want total_acknowledge_count as the sum of acknowledge_count, split into columns by acknowledge (Yes, No).

output required:

                  total_id_count      total_acknowledge_count
city,skills    High   Med   Low         Yes      No
ABC,XXX        21      5     0           3        3                # 21 = 11 + 10, 3 = 2 + 1
ABC,YYY        0       0     1           0        5

I am trying different methods like pivot_table, and groupby & stack, but it seems very difficult.

Is there any way to achieve this result?

1 Answer

You'll need to pivot separately for total_id_count and total_acknowledge_count here, since the two aggregations use different header columns (priority vs. acknowledge) and different value columns (id_count vs. acknowledge_count):

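import pandas as pd

# df is the DataFrame from the question.
# Pivot 1: sum id_count per (city, skills), one column per priority.
piv1 = df.pivot_table(index=['city', 'skills'], columns='priority',
                      values='id_count', aggfunc='sum', fill_value=0)
# Pivot 2: sum acknowledge_count per (city, skills), one column per acknowledge value.
piv2 = df.pivot_table(index=['city', 'skills'], columns='acknowledge',
                      values='acknowledge_count', aggfunc='sum', fill_value=0)

# Add a top column level so the two blocks stay distinguishable after concat.
piv1.columns = pd.MultiIndex.from_product([['id_count'], piv1.columns])
piv2.columns = pd.MultiIndex.from_product([['acknowledge_count'], piv2.columns])

# Join the two pivots side by side on the shared (city, skills) index.
output = pd.concat([piv1, piv2], axis=1)

print(output)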

            id_count         acknowledge_count
                High Low Med                No Yes
city skills
ABC  XXX          21   0   5                 3   3
     YYY           0   1   0                 5   0
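
Note that pivot_table sorts the new columns alphabetically (High, Low, Med and No, Yes), while the question asks for High, Med, Low and Yes, No. One possible way to get that order is to reindex each pivot before concatenating. The sketch below is a variation, not part of the original answer: it rebuilds the sample data so it runs on its own, uses the `keys` argument of `pd.concat` instead of `MultiIndex.from_product` to add the top level, and the labels `total_id_count` and `total_acknowledge_count` are taken from the question's desired output.

```python
import pandas as pd

# Sample data from the question (assumed dtypes: ints for the counts).
df = pd.DataFrame({
    "city": ["ABC", "ABC", "ABC", "ABC"],
    "skills": ["XXX", "XXX", "XXX", "YYY"],
    "priority": ["High", "High", "Med", "Low"],
    "acknowledge": ["Yes", "No", "Yes", "No"],
    "id_count": [11, 10, 5, 1],
    "acknowledge_count": [2, 3, 1, 5],
})

idx = ["city", "skills"]
piv1 = df.pivot_table(index=idx, columns="priority",
                      values="id_count", aggfunc="sum", fill_value=0)
piv2 = df.pivot_table(index=idx, columns="acknowledge",
                      values="acknowledge_count", aggfunc="sum", fill_value=0)

# Force the column order from the question; fill_value=0 also creates
# a zero column if a category happens to be absent from the data.
piv1 = piv1.reindex(columns=["High", "Med", "Low"], fill_value=0)
piv2 = piv2.reindex(columns=["Yes", "No"], fill_value=0)

# keys= adds the top column level while concatenating side by side.
output = pd.concat([piv1, piv2], axis=1,
                   keys=["total_id_count", "total_acknowledge_count"])

print(output)
```

For the sample data, the (ABC, XXX) row should come out as 21, 5, 0 under High, Med, Low and 3, 3 under Yes, No.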
4 Comments

Can you please provide the final dataframe's columns (.columns)?
Because I'm getting the output columns in this format: 'id_count,high','id_count,Low','id_count,Med','acknowledge_count,No','acknowledge_count,Yes'
You should get the shared output by running the above code @Shubham
The output columns are a MultiIndex @Shubham
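
To make the MultiIndex point concrete, here is a small sketch that rebuilds the sample data, runs the answer's code, and inspects `.columns`. The flattening step at the end is an optional extra, not part of the answer, and the underscore-joined names are just one possible naming choice for cases where single-level column names are easier to work with.

```python
import pandas as pd

# Sample data from the question (assumed dtypes: ints for the counts).
df = pd.DataFrame({
    "city": ["ABC", "ABC", "ABC", "ABC"],
    "skills": ["XXX", "XXX", "XXX", "YYY"],
    "priority": ["High", "High", "Med", "Low"],
    "acknowledge": ["Yes", "No", "Yes", "No"],
    "id_count": [11, 10, 5, 1],
    "acknowledge_count": [2, 3, 1, 5],
})

piv1 = df.pivot_table(index=["city", "skills"], columns="priority",
                      values="id_count", aggfunc="sum", fill_value=0)
piv2 = df.pivot_table(index=["city", "skills"], columns="acknowledge",
                      values="acknowledge_count", aggfunc="sum", fill_value=0)
piv1.columns = pd.MultiIndex.from_product([["id_count"], piv1.columns])
piv2.columns = pd.MultiIndex.from_product([["acknowledge_count"], piv2.columns])
output = pd.concat([piv1, piv2], axis=1)

# Each column label is a (top level, sub level) tuple.
print(output.columns.tolist())

# Select a single column with a tuple, or a whole block with the top label.
print(output[("id_count", "High")])
print(output["acknowledge_count"])

# Optional: flatten to single-level names such as "id_count_High".
flat = output.copy()
flat.columns = [f"{top}_{sub}" for top, sub in output.columns]
print(flat.columns.tolist())
```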
