I have a dataframe that I have pivoted, and it looks like this. It simply shows the item count per day.
MY OUTPUT
date item1 item4 item6
12/12/17 10 1 13
12/13/18 22 5 32
12/14/19 9 3 22
But the final output requested from me should show all the items in the table, whether or not there were results for that day.
EXPECTED OUTPUT
date      item1  item2  item3  item4  item5  item6
12/12/17  10                   1             13
12/13/18  22                   5             32
12/14/19  9                    3             22
Is there a way in pandas to predefine the column headers, so that my actual results are then matched against them?
What I tried was to create a separate MySQL table containing the list of items and their sequence, then query it and load it into a dataframe. I then left-merged the item list with the actual data, so I now have one table with both the item list and the actual data. But when I try to pivot it, only the columns that have values appear in the pivot.
SAMPLE SOURCE DATA
item date serial_no
item1 12/12/17 001
item1 12/12/17 002
item4 12/13/17 003
item6 12/14/17 004
item4 12/13/17 005
item6 12/14/17 006
item1 12/12/17 007
item1 12/14/17 008
This is how I pivot:
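pivot_df = df.pivot_table(
    index=['date'],          # one row per date
    values=['serial_no'],    # column whose entries are counted
    columns=['item'],        # one output column per item
    aggfunc=[len]            # number of rows per date/item pair
)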
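For reference, here is a minimal reproducible sketch built from the sample source data above. It also shows one approach that might give the expected layout: pivot as usual, then call `reindex(columns=...)` with the full item list. The `all_items` list is a hypothetical hard-coded stand-in for the item list I keep in MySQL, and I have passed `index`, `values`, `columns` and `aggfunc` as plain strings (with `"count"` instead of `len`) so the resulting columns are flat rather than multi-level. Those simplifications are my assumptions, not part of my real code.

```python
import pandas as pd

# Sample source data from the question.
df = pd.DataFrame({
    "item": ["item1", "item1", "item4", "item6",
             "item4", "item6", "item1", "item1"],
    "date": ["12/12/17", "12/12/17", "12/13/17", "12/14/17",
             "12/13/17", "12/14/17", "12/12/17", "12/14/17"],
    "serial_no": ["001", "002", "003", "004", "005", "006", "007", "008"],
})

# Hypothetical full item list (assumption: in practice this would be
# loaded from the separate item table rather than hard-coded).
all_items = ["item1", "item2", "item3", "item4", "item5", "item6"]

# Count serial numbers per date and item; only items that occur in the
# data become columns at this point.
pivot_df = df.pivot_table(
    index="date",
    values="serial_no",
    columns="item",
    aggfunc="count",
)

# Force the predefined column set; items with no rows become NaN columns.
full_df = pivot_df.reindex(columns=all_items)
print(full_df)
```

If zeros are wanted instead of empty cells, chaining `.fillna(0)` after the `reindex` call should cover both the newly added columns and the gaps in the existing ones.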