My current dataframe looks like this:

 in_1    | in_2       a    b    c    d    e
---------|------------------------------------
 car     | bmw        2    4    5    34   46
         | merc       23   4    55   64   21
         | range      453  32   2    56   21
         | lambo      4    6    2    5    12
         | ferrari    12   46   34   23   642
 fastfood| burger     123  34   213  23   234
         | kfc        123  34   235  123  24
         | tacoBell   213  432  124  12   1

I am trying to plot a subplot for each 'in_1' in which the x-axis is the column names (a, b, c, d, e), while the y-axis is the counts (the numbers in the cells).

So the first subplot would have the title "car". The x-axis would have 'a','b', 'c', 'd', 'e'. The y-axis will have the counts for each of 'bmw', 'merc', 'range', 'lambo', 'ferrari'.

The subplots can be bar or line plots and the values of in_2 can be represented in the form of a legend.

2 Answers

So I guess you could do something like this:

import numpy as np
import matplotlib.pyplot as plt

ng = 5 #number of groups

bmw = ..
merc = ..
..

fig, ax = plt.subplots()

index = np.arange(ng)
bar_width = 0.2

fbmw = ax.bar(index, bmw, bar_width, color='r', label='BMW')
fmerc = ax.bar(index + bar_width, merc, bar_width, color='b', label='MERC')
...
# don't forget to shift each further series by one more bar_width (index + 2*bar_width, ...)

ax.set_xlabel('Cars')
ax.set_ylabel('Whatever this is')
ax.set_xticks(index + bar_width/2)
ax.set_xticklabels(('a', 'b', 'c', 'd', 'e'))
ax.legend()

fig.tight_layout()

Since I don't know what the numbers and the columns a, b, c, d, e represent, I left the axis labels as placeholders. I also assumed you already have the data for bmw, merc, etc., so I left those values out. Hope this helps!
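For completeness, here is a filled-in version of the same idea for the 'car' group only. It is a sketch, not the only way to do it: the values are copied from the table in the question, the loop replaces the one-call-per-series pattern above, and the 0.8 cluster width, the axis label texts and the tick centering are my own assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

columns = ["a", "b", "c", "d", "e"]

# 'car' rows copied from the table in the question
series = {
    "bmw": [2, 4, 5, 34, 46],
    "merc": [23, 4, 55, 64, 21],
    "range": [453, 32, 2, 56, 21],
    "lambo": [4, 6, 2, 5, 12],
    "ferrari": [12, 46, 34, 23, 642],
}

index = np.arange(len(columns))
# assumption: let each cluster of bars take up 0.8 of the space between ticks
bar_width = 0.8 / len(series)

fig, ax = plt.subplots()
for k, (name, values) in enumerate(series.items()):
    # shift the k-th series by k bar widths so the bars sit side by side
    ax.bar(index + k * bar_width, values, bar_width, label=name)

ax.set_title("car")
ax.set_xlabel("column")
ax.set_ylabel("count")
# put each tick under the middle of its cluster
ax.set_xticks(index + bar_width * (len(series) - 1) / 2)
ax.set_xticklabels(columns)
ax.legend()

fig.tight_layout()
```

Call plt.show() (or fig.savefig(...)) afterwards to see the result.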

1 Comment

I actually found something extremely similar to what I have written. I guess that's the page where I first learned about this stuff: matplotlib.org/gallery/statistics/barchart_demo.html

You can use a simple loop to pick up all your columns and assign them to an axis. It also creates the subplots, with the number of rows determined by the unique values in in_1.

Please note that it assumes you have a MultiIndex df with in_1 and in_2 as the index levels.
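If you need a df of that shape to try things out, something like this should do. It is only a sketch: the numbers are copied from the table in the question, and building the frame from a dict is my assumption, not necessarily how the original data was loaded.

```python
import pandas as pd

# (in_1, in_2) -> values for columns a..e, copied from the question's table
data = {
    ("car", "bmw"): [2, 4, 5, 34, 46],
    ("car", "merc"): [23, 4, 55, 64, 21],
    ("car", "range"): [453, 32, 2, 56, 21],
    ("car", "lambo"): [4, 6, 2, 5, 12],
    ("car", "ferrari"): [12, 46, 34, 23, 642],
    ("fastfood", "burger"): [123, 34, 213, 23, 234],
    ("fastfood", "kfc"): [123, 34, 235, 123, 24],
    ("fastfood", "tacoBell"): [213, 432, 124, 12, 1],
}

# name the two index levels so get_level_values('in_1') / ('in_2') work
index = pd.MultiIndex.from_tuples(list(data.keys()), names=["in_1", "in_2"])
df = pd.DataFrame(list(data.values()), index=index, columns=list("abcde"))
```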

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

n = len(df.columns)
rows = df.index.get_level_values('in_1').unique()
index = np.arange(n)

fig, axs = plt.subplots(len(rows),1,figsize=(15,10))
width = 0.2

colors=["#e08283", "#52b3d9", "#fde3a7", "#3fc380"]

for i in range(len(rows)):
    intdf = df[df.index.get_level_values('in_1') == rows[i]]
    offset = 0
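    for j in range(len(intdf.index.get_level_values('in_2'))):
        v = intdf.iloc[j,].values
        # cycle through the palette: 'car' has 5 rows but there are only 4 colors,
        # so colors[j] alone would raise an IndexError
        axs[i].bar(index + offset, v, width, 
                   label=intdf.index.get_level_values('in_2')[j], color=colors[j % len(colors)])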

        offset += width

    axs[i].set_xlabel(rows[i])
    axs[i].set_xticks(index + width)
    axs[i].set_xticklabels(tuple(df.columns))
    axs[i].legend(loc=2)


plt.show()

Here is the output.


As a reminder more information can be found here
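As an alternative to the manual offsets, pandas' own plotting can do the grouping for you. This is a sketch of a different technique, not the loop above: it selects each in_1 group with df.xs, transposes it so that a..e end up on the x-axis, and lets DataFrame.plot draw one bar series per in_2 value. The df is rebuilt from the question's table so the snippet runs on its own; figsize and the y-axis label are my assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt

# stand-in df built from the table in the question
rows = {
    ("car", "bmw"): [2, 4, 5, 34, 46],
    ("car", "merc"): [23, 4, 55, 64, 21],
    ("car", "range"): [453, 32, 2, 56, 21],
    ("car", "lambo"): [4, 6, 2, 5, 12],
    ("car", "ferrari"): [12, 46, 34, 23, 642],
    ("fastfood", "burger"): [123, 34, 213, 23, 234],
    ("fastfood", "kfc"): [123, 34, 235, 123, 24],
    ("fastfood", "tacoBell"): [213, 432, 124, 12, 1],
}
idx = pd.MultiIndex.from_tuples(list(rows.keys()), names=["in_1", "in_2"])
df = pd.DataFrame(list(rows.values()), index=idx, columns=list("abcde"))

groups = df.index.get_level_values("in_1").unique()
# squeeze=False keeps axs two-dimensional even if there is only one group
fig, axs = plt.subplots(len(groups), 1, figsize=(10, 8), squeeze=False)

for ax, name in zip(axs[:, 0], groups):
    # xs() drops the in_1 level; .T turns a..e into the x-axis categories
    sub = df.xs(name, level="in_1").T
    sub.plot(kind="bar", ax=ax, title=name, rot=0)
    ax.set_ylabel("count")

fig.tight_layout()
```

Swap kind="bar" for kind="line" if you prefer line plots; each in_2 value still gets its own legend entry.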
