I would like to use matplotlib to visualize the following pandas dataframe as shown in the sketch.

The sketch only shows what is needed in general terms; the result does not need to match the depicted layout exactly.

How can I achieve this task using matplotlib?

import pandas as pd
df = pd.DataFrame({'a': [0, 0, 0, 0, 0, 1, 1], 'b': [7, 7, 3, 3, 1, 2, 3], 'c': [102, 102, -50, -50, 30, 10, 10]})
df
   a  b    c
0  0  7  102
1  0  7  102
2  0  3  -50
3  0  3  -50
4  0  1   30
5  1  2   10
6  1  3   10

sketch

1 Answer

Before starting on the visualisation, I would suggest reshaping your data to make the nesting levels explicit and pre-calculating the frequencies. Something like:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import matplotlib.gridspec as gridspec

temp_df = pd.concat([
    df.groupby(["a"])["b"].value_counts().reset_index(name="count").rename(columns={"b":"value"}).assign(level_2="b"),
    df.groupby(["a"])["c"].value_counts().reset_index(name="count").rename(columns={"c":"value"}).assign(level_2="c")
])

final_df = (temp_df
            .rename(columns={"a":"level_1"})
            [["level_1", "level_2", "value", "count"]]
            .sort_values(["level_1", "level_2"]))

The resulting dataframe will look like this:

   level_1 level_2  value  count
0        0       b      3      2
1        0       b      7      2
2        0       b      1      1
0        0       c    -50      2
1        0       c    102      2
2        0       c     30      1
3        1       b      2      1
4        1       b      3      1
3        1       c     10      2
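With more than two value columns, writing one `groupby` per column gets repetitive. A possible alternative, sketched here on the assumption that all value columns have compatible dtypes, is to `melt` the frame first and count rows with `groupby(...).size()`. The name `alt_df` is only illustrative. Rows come out sorted by value rather than by count, so the order within each group differs from the table above.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [0, 0, 0, 0, 0, 1, 1],
    "b": [7, 7, 3, 3, 1, 2, 3],
    "c": [102, 102, -50, -50, 30, 10, 10],
})

# Stack every non-"a" column into (a, level_2, value) rows,
# then count how often each combination occurs.
alt_df = (
    df.melt(id_vars="a", var_name="level_2", value_name="value")
      .groupby(["a", "level_2", "value"])
      .size()
      .reset_index(name="count")
      .rename(columns={"a": "level_1"})
)

print(alt_df)
```

This keeps the same four columns (`level_1`, `level_2`, `value`, `count`), so the plotting code should not need to change.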

Now to plot values and their counts in this nested way, you can use GridSpec to define the layout based on how many values fall under each nested level. I've hard-coded the values for the purposes of illustrating this toy dataset, but you'd want to handle this programmatically for your real data.

You have 9 values, so the GridSpec will have 9 columns. You have 2 nested levels, so we reserve the 2 bottom rows for nesting labels and use the remaining 8 rows (10 in total) to "host" the bar charts.
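For real data, the column ranges do not have to be typed by hand. The following is a minimal sketch of how they might be derived from the reshaped frame. The names `level_2_spans`, `level_1_spans` and `n_cols` are hypothetical, and `final_df` is rebuilt literally so the snippet runs on its own.

```python
import pandas as pd

# Same content as final_df above, typed out so this snippet is self-contained
final_df = pd.DataFrame({
    "level_1": [0, 0, 0, 0, 0, 0, 1, 1, 1],
    "level_2": list("bbbcccbbc"),
    "value": [3, 7, 1, -50, 102, 30, 2, 3, 10],
    "count": [2, 2, 1, 2, 2, 1, 1, 1, 2],
})

# One grid column per bar: count the rows in each (level_1, level_2) group
widths = final_df.groupby(["level_1", "level_2"]).size()

# Walk the groups in order and assign half-open column ranges [start, stop)
level_2_spans = {}
start = 0
for (l1, l2), width in widths.items():
    level_2_spans[(int(l1), l2)] = (start, start + int(width))
    start += int(width)

# A level-1 label spans all of the level-2 groups nested under it
level_1_spans = {}
for (l1, _), (lo, hi) in level_2_spans.items():
    first, _ = level_1_spans.get(l1, (lo, hi))
    level_1_spans[l1] = (first, hi)

n_cols = start  # total number of GridSpec columns

print(level_2_spans)
print(level_1_spans)
```

Each `(start, stop)` pair can then be used as a slice, for example `grid[8, start:stop]` for a level-2 label.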

f = plt.figure(figsize=(10,4), dpi=300)

grid = gridspec.GridSpec(10, 9, figure=f)
mpl.rcParams["axes.edgecolor"] = "gainsboro"

# Use context manager to set mpl parameters for nested axs
with mpl.rc_context({"xtick.major.bottom": False, "ytick.major.left": False}):

    # Level 1 axs (label, ax)
    ax_level_1_0 = ("0", f.add_subplot(grid[9, 0:6]))
    ax_level_1_1 = ("1", f.add_subplot(grid[9, 6:]))
    level_1_axs = [ax_level_1_0, ax_level_1_1]

    # Level 2 axs (label, ax)
    ax_level_2_0b = ("B", f.add_subplot(grid[8, 0:3]))
    ax_level_2_0c = ("C", f.add_subplot(grid[8, 3:6]))
    ax_level_2_1b = ("B", f.add_subplot(grid[8, 6:8]))
    ax_level_2_1c = ("C", f.add_subplot(grid[8, 8:]))
    level_2_axs = [ax_level_2_0b, ax_level_2_0c, ax_level_2_1b, ax_level_2_1c]

# Actual count plot axs (level_1, level_2, ax)
ax_0b = (0, "b", f.add_subplot(grid[0:8, 0:3]))
ax_0b[2].set_ylabel("Frequency")

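# Hide y-ticks; share the y-axis with the first panel so bar heights are comparable
with mpl.rc_context({"ytick.major.left": False}):
    ax_0c = (0, "c", f.add_subplot(grid[0:8, 3:6], sharey=ax_0b[2]))
    ax_1b = (1, "b", f.add_subplot(grid[0:8, 6:8], sharey=ax_0b[2]))
    ax_1c = (1, "c", f.add_subplot(grid[0:8, 8:], sharey=ax_0b[2]))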

count_axs = [ax_0b, ax_0c, ax_1b, ax_1c]

# Remove white space between subplots
plt.subplots_adjust(wspace=0, hspace=0)

# Add label text to Level 1 and 2 axs
for label, ax in level_1_axs + level_2_axs:
    ax.text(0.5, 0.5, label, horizontalalignment='center',
            verticalalignment='center', transform=ax.transAxes)

    
for l1, l2, ax in count_axs:
    # Select the rows for this (level_1, level_2) panel once
    subset = final_df.query(f'(level_1 == {l1}) & (level_2 == "{l2}")')
    y = subset["count"]
    labels = subset["value"]
    x = range(len(y))
    ax.bar(x, y, color="steelblue")
    ax.set_xticks(x)
    ax.set_xticklabels(labels)
    ax.tick_params(
        axis="x", direction="in", bottom=False, pad=-20,
        colors="white", labelsize=15)
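The hard-coded slices above could be replaced by a loop over the groups. This is a rough sketch rather than a drop-in replacement: `nested_count_plot` is a hypothetical helper, the `Agg` backend is assumed so it runs without a display, and the styling (edge colour, in-bar tick labels) is left out for brevity.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

final_df = pd.DataFrame({
    "level_1": [0, 0, 0, 0, 0, 0, 1, 1, 1],
    "level_2": list("bbbcccbbc"),
    "value": [3, 7, 1, -50, 102, 30, 2, 3, 10],
    "count": [2, 2, 1, 2, 2, 1, 1, 1, 2],
})


def nested_count_plot(data, bar_rows=8):
    """Hypothetical helper: one bar panel per group plus two label rows."""
    fig = plt.figure(figsize=(10, 4))
    grid = fig.add_gridspec(bar_rows + 2, len(data))  # one column per bar
    bar_axes, label_axes = {}, []
    first_ax = None
    col = 0
    for l1, l1_rows in data.groupby("level_1"):
        l1_start = col
        for l2, rows in l1_rows.groupby("level_2"):
            stop = col + len(rows)
            # Share the y-axis so bar heights are comparable across panels
            ax = fig.add_subplot(grid[:bar_rows, col:stop], sharey=first_ax)
            x = list(range(len(rows)))
            ax.bar(x, rows["count"].to_numpy(), color="steelblue")
            ax.set_xticks(x)
            ax.set_xticklabels(rows["value"].tolist())
            if first_ax is None:
                first_ax = ax
                ax.set_ylabel("Frequency")
            else:
                ax.tick_params(axis="y", left=False, labelleft=False)
            bar_axes[(l1, l2)] = ax
            # Level-2 label cell directly under this panel
            label_axes.append(
                (str(l2).upper(), fig.add_subplot(grid[bar_rows, col:stop])))
            col = stop
        # Level-1 label cell spanning all panels of this group
        label_axes.append(
            (str(l1), fig.add_subplot(grid[bar_rows + 1, l1_start:col])))
    for label, ax in label_axes:
        ax.set_xticks([])
        ax.set_yticks([])
        ax.text(0.5, 0.5, label, ha="center", va="center",
                transform=ax.transAxes)
    fig.subplots_adjust(wspace=0, hspace=0)
    return fig, bar_axes


fig, bar_axes = nested_count_plot(final_df)
```

Calling `fig.savefig(...)` with a path of your choice would write the result to disk.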
