My code below works fine, but I think there is a more efficient way to write it and I can't figure it out. I expected reset_index() to do the job, but it doesn't in this case. All suggestions are welcome. Thanks in advance!
I have a large dataframe of hospital data covering 2017, 2018 and 2019. The column spoedelectief can have two values: one for emergency patients and one for non-emergency patients. In Dutch, emergency is called Spoed, so emergency is S and non-emergency is E.
From this dataframe I want to build a new dataframe so I can visualize the number of emergency and non-emergency patients per year, but I'm stuck. Some code:
test = df_new.groupby(df_new['operatiejaar'])['spoedelectief'].value_counts().sort_index()
gives back a Pandas Series:
operatiejaar  spoedelectief
2017          E                5459
              S                1054
2018          E                6191
              S                1029
2019          E                6160
              S                1159
For visualisation in Seaborn I tried to make this a DataFrame with reset_index() but that gives an error:
ValueError: cannot insert spoedelectief, already exists
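As far as I can tell, the error appears because the Series itself is named spoedelectief (at least in the pandas version I'm using) while one of its index levels has the same name, so reset_index() tries to create that column twice. A minimal sketch of a possible workaround, using the name parameter of Series.reset_index to label the counts column; the small df_new here is a made-up stand-in, not my real data:

```python
import pandas as pd

# Hypothetical stand-in for df_new (the real hospital data is not shown)
df_new = pd.DataFrame({
    "operatiejaar": [2017, 2017, 2017, 2018, 2018, 2019],
    "spoedelectief": ["E", "E", "S", "E", "S", "S"],
})

test = df_new.groupby("operatiejaar")["spoedelectief"].value_counts().sort_index()

# name= sets the label of the values column, so it no longer
# clashes with the 'spoedelectief' index level
spel_jaar = test.reset_index(name="totaal")
print(spel_jaar)
```

The column name totaal is just my choice; any name that differs from the index level names should avoid the clash.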
Making test a DataFrame works:
test = pd.DataFrame(test)
With this result:
But test.columns gives this:
Index(['spoedelectief'], dtype='object')
Below is the code I used to create the DataFrame I wanted:
test = df_new.groupby(df_new['operatiejaar'])['spoedelectief'].value_counts().sort_index()
jaar_list = []
spel_list = []
totaal = []
# each index entry is a (operatiejaar, spoedelectief) tuple
for index, value in test.items():
    jaar_list.append(index[0])
    spel_list.append(index[1])
    totaal.append(value)
spel_jaar = pd.DataFrame(
    {'jaar': jaar_list,
     'spoedelectief': spel_list,
     'totaal': totaal
    })
Which gives the desired DF:
How can I do this more simply, ideally directly from the original DataFrame? Thanks!
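One direction that might work, shown only as a sketch: group on both columns at once, count the rows with size(), and name the count column in reset_index(). The sample rows below are made up to stand in for df_new, and the rename to jaar is only there to match the column names of my loop version:

```python
import pandas as pd

# Hypothetical sample standing in for df_new (not the real hospital data)
df_new = pd.DataFrame({
    "operatiejaar": [2017, 2017, 2017, 2018, 2018, 2019],
    "spoedelectief": ["E", "E", "S", "E", "S", "S"],
})

spel_jaar = (
    df_new.groupby(["operatiejaar", "spoedelectief"])
    .size()                                     # rows per (year, S/E) pair
    .reset_index(name="totaal")                 # index levels become columns
    .rename(columns={"operatiejaar": "jaar"})   # match the desired column name
)
print(spel_jaar)
```

If this holds up on the real data, the result should be in long format, which I believe Seaborn can plot with jaar on the x axis, totaal on the y axis and spoedelectief as hue.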


