Some excercise is asking to "add a secondary name" to a df column when applying an aggregation function.
Given following DF:
stype = ['s1','s1','s1','s1',
's2','s2','s2','s2']
dtype = ['d1','d1','d2','d2',
'd1','d1','d2','d2']
qty = [2, 1.4, 7, 3,
4, 1, 2.0, 3]
df = pd.DataFrame({'s_type':stype,
'd_type':dtype,
'qty':qty})
When grouping by the first 2 columns and applying an agg function like this
new_df=df.groupby(['s_type','d_type'],sort = False).agg({'qty':'median'})
And do print(new_df)
I get this output, notice the qty is moved up a line (I assume this is normal behavior) because is the one column where agg function was applied
qty
s_type d_type
s1 d1 1.7
d2 5.0
s2 d1 2.5
d2 2.5
Question(s ) would it be
Is it possible to setc secondary column name under qty label/name ?, to get an output like this:
qty
s_type d_type Median
s1 d1 1.7
d2 5.0
s2 d1 2.5
d2 2.5
Also
Why qty moves up 1 line ? (I'm using Spyder and I've seen this in the Anaconda/python console as well, not sure if its like a feature or pandas normal behavior).
One Last update: in case you need to use sort_values you will have to use the tuple for the column you add a secondary index/name, pretty much in the same way when applying the aggregate function (see the answer)
new_df=new_df.sort_values([('qty','MEDIAN')])
So you get the DF sorted by the column where agg function was applied and where we add an index
qty
MEDIAN
s_type d_type
s1 d1 1.7
s2 d1 2.5
d2 2.5
s1 d2 5.0
python v3.6
pandas v0.23
Thank you.