I have a DataFrame like this (sample):
A B C D E
0 V1 B1 Clearing C1 1538884.46
1 V1 B1 CustomerPayment_Difference C1 13537679.70
2 V1 B1 Invoice C1 -15771005.81
3 V1 B1 PaymentDifference C1 0.00
4 V2 B2 Clearing C2 104457.22
5 V2 B2 Invoice C2 -400073.56
6 V2 B2 Payment C2 297856.45
7 V3 B3 Clearing C3 1989462.95
8 V3 B3 CreditMemo C3 538.95
9 V3 B3 CustomerPayment_Difference C3 2112329.00
10 V3 B3 Invoice C3 -4066485.69
11 V4 B4 Clearing C4 -123946.13
12 V4 B4 CreditMemo C4 127624.66
13 V4 B4 Accounting C4 424774.52
14 V4 B4 Invoice C4 -40446521.41
15 V4 B4 Payment C4 44441419.95
I want to reshape this DataFrame as shown below:
A B D Accounting Clearing CreditMemo CustomerPayment_Difference \
V1 B1 C1 NaN 1538884.46 NaN 13537679.7
V2 B2 C2 NaN 104457.22 NaN NaN
V3 B3 C3 NaN 1989462.95 538.95 2112329.0
V4 B4 C4 424774.52 -123946.13 127624.66 NaN
C Invoice Payment PaymentDifference
0 -15771005.81 NaN 0.0
1 -400073.56 297856.45 NaN
2 -4066485.69 NaN NaN
3 -40446521.41 44441419.95 NaN
So far I have tried using pivot:
df.pivot(index='A',columns='C', values='E').reset_index()
It gives a result like this:
C A Accounting Clearing CreditMemo CustomerPayment_Difference \
0 V1 NaN 1538884.46 NaN 13537679.7
1 V2 NaN 104457.22 NaN NaN
2 V3 NaN 1989462.95 538.95 2112329.0
3 V4 424774.52 -123946.13 127624.66 NaN
C Invoice Payment PaymentDifference
0 -15771005.81 NaN 0.0
1 -400073.56 297856.45 NaN
2 -4066485.69 NaN NaN
3 -40446521.41 44441419.95 NaN
The table above leaves out the B and D columns, and I need those columns as well.
I have provided this sample data for simplicity, but in the future the data may also look like this:
A B C D E
0 V1 B1 Clearing C1 1538884.46
1 V1 B1 CustomerPayment_Difference C1 13537679.70
2 V1 B1 Invoice C1 -15771005.81
3 V1 B1 PaymentDifference C1 0.00
**4 V1 B2 Clearing C1 88.9
5 V1 B2 Clearing C2 79.9**
In this situation my code will throw a duplicate index error.
To fix these two problems I need to specify A, B and D as the index. I need code similar to this:
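For reference, the duplicate index problem can be reproduced with a minimal sketch like the one below. The DataFrame is rebuilt by hand from the extended sample above, and the variable name `error_message` is only illustrative.

```python
import pandas as pd

# Extended sample from above: A == "V1" now has three "Clearing" rows.
df = pd.DataFrame({
    "A": ["V1", "V1", "V1", "V1", "V1", "V1"],
    "B": ["B1", "B1", "B1", "B1", "B2", "B2"],
    "C": ["Clearing", "CustomerPayment_Difference", "Invoice",
          "PaymentDifference", "Clearing", "Clearing"],
    "D": ["C1", "C1", "C1", "C1", "C1", "C2"],
    "E": [1538884.46, 13537679.70, -15771005.81, 0.00, 88.9, 79.9],
})

# With only A as the index, the pair (A="V1", C="Clearing") occurs more
# than once, so pivot cannot place the values into a single cell.
try:
    df.pivot(index="A", columns="C", values="E")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```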
df.pivot(index=['A','B','D'],columns='C', values='E').reset_index()
This code throws an error.
How can I solve this? How do I provide multiple columns as the index in a pandas pivot table?
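One possible direction, as a sketch only and not a confirmed fix: `DataFrame.pivot_table` accepts a list of column names for `index`. The choice of `aggfunc="first"` is an assumption on my part. It only matters if the same (A, B, D, C) combination appears more than once, in which case it silently keeps the first value instead of raising an error. The name `wide` is illustrative.

```python
import pandas as pd

# Extended sample from the question, rebuilt by hand.
df = pd.DataFrame({
    "A": ["V1", "V1", "V1", "V1", "V1", "V1"],
    "B": ["B1", "B1", "B1", "B1", "B2", "B2"],
    "C": ["Clearing", "CustomerPayment_Difference", "Invoice",
          "PaymentDifference", "Clearing", "Clearing"],
    "D": ["C1", "C1", "C1", "C1", "C1", "C2"],
    "E": [1538884.46, 13537679.70, -15771005.81, 0.00, 88.9, 79.9],
})

# Use A, B and D together as the row key and spread C into columns.
# aggfunc="first" is an assumption: each (A, B, D, C) group here has
# exactly one row, so no real aggregation takes place.
wide = df.pivot_table(
    index=["A", "B", "D"], columns="C", values="E", aggfunc="first"
).reset_index()

# Drop the leftover "C" label on the column axis for a flat header.
wide.columns.name = None

print(wide)
```

If duplicates should raise an error rather than be collapsed, `df.set_index(['A', 'B', 'D', 'C'])['E'].unstack('C').reset_index()` is an alternative that avoids aggregation altogether.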