I was looking to add subtotals to a pandas dataframe - a question which I found to be asked here often. The answers making use of the deprecated pd.append aren't relevant anymore so I figured a more up-to-date version could be useful, not only for me, but for others as well.
The problem: Have a dataframe of the form (for example) -
df = pd.DataFrame(
{"A": ["A1", "A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3"],
"B": ["B1", "B1", "B1", "B2", "B2", "B2", "B3", "B3", "B3"],
"C": ["C1", "C1", "C2", "C2", "C2", "C3", "C3", "C3", "C3"],
"D": [1,2,3,4,5,6,7,8,9]})
After a df.pivot_table(), add the subtotals for each hierarchical level (or just the highest level). Thanks for any tips, I didn't figure out a straightforward and general way to accomplish this.
EDIT:
I should probably add that I'm interested in the case of index being a list of 2 and more variables. The case for n = 1 is simple enough using margins.
The idea, after a df.pivot_table(columns="C", index = ["A", "B"], values = "D", fill_value=0, aggfunc="sum") , is to get something like
C C1 C2 C3 ....
A B
A1 B1 3 3 0
A1 B2 0 9 6
Totals 3 12 6
A2 B1 0 0 24
...
where the subtotals are over the A1 ... An levels and in the corresponding C1 ... Cn columns (for the general case), not the rows.
The sample output itself is
C C1 C2 C3
A B
A1 B1 3 3 0
A2 B2 0 9 6
A3 B3 0 0 24
pd.append. I don't know why I got it confused hereconcatis the de-facto replacement forappend, so there's a chance you could rework one of those other answers to fit the need.