I have a data frame with a MultiIndex (expenditure, groupid):

                        coef    stderr    N
expenditure groupid                         
TOTEXPCQ    176      3745.124  858.1998   81
            358     -1926.703  1036.636   75
            109      239.3678   639.373  280
            769      6406.512  1823.979   96
            775      2364.655  1392.187  220

I can get the density using df['coef'].plot(kind='density'). I would like to group these densities by the outer level of the MultiIndex (expenditure), and draw the different densities for different levels of expenditure into the same plot.

How can I achieve this? Bonus: label each density curve with its 'expenditure' value.

Answer

My initial approach was to merge the different KDEs by creating one ax object and passing it to each plot call, but the accepted answer inspired me to instead build one DataFrame with one column per group identifier (here, expenditure):

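import numpy as np
import pandas as pd

n = 25
df = pd.DataFrame({'expenditure': np.random.choice(['foo', 'bar'], n),
                   'groupid': np.random.choice(['one', 'two'], n),
                   'coef': np.random.randn(n)})
# One column per expenditure value; each row keeps a single observation
df2 = df.pivot_table(index=df.index, columns='expenditure', values='coef')
df2.plot(kind='kde')  # draws one density curve per column on the same axes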
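
The shared-axes approach mentioned above can be sketched as follows. This is only an illustration under assumptions: the data are random, the second expenditure name `FOODCQ` is made up, and the frame is built to mimic the question's MultiIndex (expenditure, groupid). Each group's density is drawn onto the same `ax`, and the `label` argument covers the bonus of naming each curve.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data shaped like the question's frame
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [["TOTEXPCQ", "FOODCQ"], range(20)], names=["expenditure", "groupid"]
)
df = pd.DataFrame({"coef": rng.normal(size=len(index))}, index=index)

fig, ax = plt.subplots()
# Group on the outer index level and draw every density onto the same axes
for name, coefs in df["coef"].groupby(level="expenditure"):
    coefs.plot(kind="density", ax=ax, label=name)
ax.legend(title="expenditure")
ax.set_xlabel("coef")
```

Note that `kind="density"` relies on SciPy being installed. Grouping with `level="expenditure"` works directly on the MultiIndex, so no reshaping is needed for this variant.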


1 Answer

Wow, that ended up being much harder than I expected. Seemed easy in concept, but (yet again) concept and practice really differed.

Set up some toy data:

import numpy as np
import pandas as pd

n = 25
df = pd.DataFrame({'expenditure': np.random.choice(['foo', 'bar'], n),
                   'groupid': np.random.choice(['one', 'two'], n),
                   'coef': np.random.randn(n)})

Then group by expenditure, iterate through each expenditure, pivot the data, and plot the kde:

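import matplotlib.pyplot as plt

for name, group in df.groupby('expenditure'):
    print(name)
    # Reset to a unique 0..k-1 row index so each observation gets its own row
    g = group[['groupid', 'coef']].reset_index(drop=True)
    # One column per groupid, because DataFrame.plot draws one curve per column
    gpt = g.pivot_table(index=g.index, columns='groupid', values='coef')
    gpt.plot(kind='kde').set_title(name)
    plt.show()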

This produces one figure per expenditure value, each with one density curve per groupid.

It took some trial and error to figure out the data had to be pivoted before plotting.
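
To make the pivoting step more concrete, here is a small sketch of why the unique row index matters. It reuses the toy column names from above with random data, so treat it as an illustration rather than a definitive recipe. Pivoting on a repeated key either fails (`pivot`) or collapses observations into their mean (`pivot_table`), while pivoting on the row labels keeps every observation for the KDE.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 25
df = pd.DataFrame({
    "expenditure": rng.choice(["foo", "bar"], n),
    "groupid": rng.choice(["one", "two"], n),
    "coef": rng.normal(size=n),
})

# pivot() needs unique (index, columns) pairs; groupid repeats, so it raises
try:
    df.pivot(index="groupid", columns="expenditure", values="coef")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() accepts the duplicates but aggregates them (mean by default)
means = df.pivot_table(index="groupid", columns="expenditure", values="coef")

# Using the unique row labels as the index keeps every observation
by_row = df.pivot_table(index=df.index, columns="expenditure", values="coef")
```

In `by_row`, each row holds exactly one non-missing value, in the column of that row's expenditure, which is the shape that `plot(kind='kde')` turns into one curve per column.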

1 Comment

This forced me to finally understand when to use pivot_table() instead of pivot()
