Matplotlib: loop through DataFrames and add subplots to a figure

Question

I have a DataFrame with 7 columns of categorical data. I would like to loop through the columns, count the rows for each unique label in a column, and add each result as a bar-chart subplot to my figure. I can create a figure with the correct number of subplots, and I can build the individual DataFrames of labels and counts, but I'm not sure how to add a new subplot to the figure on each pass through the loop. What is the proper way to do this? My attempt is below, along with the error message raised in the loop:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Libraries

# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
%matplotlib inline

Categorical DataFrame (df_cat)

df_cat = example data from .head() method

Figure and amount of subplots

plt_nrows = round(len(df_cat.columns) / 2)
plt_ncols = len(df_cat.columns) - plt_nrows

fig, axs = plt.subplots(plt_nrows, plt_ncols, figsize=(20,15))

Loop of DataFrames with individual columns and label counts:

for i in df_cat.columns:
    df_cat_counts = df_cat[i].value_counts().rename_axis([i]).reset_index(name='counts')
    x = df_cat_counts[i]
    y = df_cat_counts['counts']
    axs[i,0].plot(x, y)

warped · Accepted Answer · 2020-06-16 08:47:42Z

I am limiting myself to the columns ['Age', 'Directors', 'Genres', 'Country', 'Language'], because IMDb, Rotten Tomatoes, and Netflix are, in my opinion, not really categorical data.

import itertools

import matplotlib.pyplot as plt
import pandas as pd

# Columns whose cells hold several values joined by a delimiter
split_dict = {'Directors': ',',
              'Genres': ',',
              'Country': ',',
              'Language': ','}

columns = ['Age', 'Directors', 'Genres', 'Country', 'Language']

fig = plt.figure(figsize=(20, 20))

for p, col in enumerate(columns):
    # subplot positions are 1-based, so use p + 1 on a 2x3 grid
    ax = fig.add_subplot(2, 3, p + 1)

    split = split_dict.get(col)

    if split:  # split each cell on the delimiter, then flatten with chain.from_iterable
        x = pd.Series(itertools.chain.from_iterable(df_cat[col].dropna().str.split(split))).to_frame(name=col)
    else:
        x = df_cat[[col]]

    # count the rows per label and draw the bars on this subplot
    x.groupby(col).size().plot(kind='bar', ax=ax)
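The IndexError in the question comes from indexing the axes array with a column name (axs[i,0], where i is a string). As a minimal sketch of an alternative, assuming the grid is created up front with plt.subplots, the axes can be flattened and paired with the column names. The helper name plot_category_counts, its ncols and top_n parameters, and the sample data are hypothetical and not part of the original post; the delimiter splitting shown above is left out for brevity.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd


def plot_category_counts(df, columns, ncols=3, top_n=None):
    """Draw one bar chart of label counts per column on a shared grid (hypothetical helper)."""
    nrows = math.ceil(len(columns) / ncols)
    # squeeze=False keeps axs two-dimensional even for a single row or column
    fig, axs = plt.subplots(nrows, ncols, figsize=(5 * ncols, 4 * nrows), squeeze=False)
    flat_axes = axs.ravel()

    # pair each Axes object with a column name instead of indexing axs by name
    for ax, col in zip(flat_axes, columns):
        counts = df[col].value_counts()
        if top_n is not None:
            counts = counts.head(top_n)  # keep only the most frequent labels
        counts.plot(kind="bar", ax=ax, title=col)

    # hide grid cells that have no column assigned
    for ax in flat_axes[len(columns):]:
        ax.set_visible(False)

    fig.tight_layout()
    return fig, axs


# made-up sample data standing in for df_cat
sample = pd.DataFrame({
    "Age": ["18+", "7+", "18+", "all"],
    "Language": ["English", "English", "Hindi", "English"],
    "Country": ["US", "IN", "US", "US"],
    "Genres": ["Drama", "Comedy", "Drama", "Action"],
})
fig, axs = plot_category_counts(sample, list(sample.columns), ncols=3)
```

Passing top_n may help when a column has a very large number of distinct labels, since every bar gets its own tick label.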

edited Jun 16, 2020 at 8:47

answered Jun 14, 2020 at 16:55

7 Comments

cphill Over a year ago

That is exactly what I am going for. I tried applying this logic to my code, but the output of plt.tight_layout() is <Figure size 432x288 with 0 Axes> and no figure is shown.

cphill Over a year ago

My comment above applies when that command is run in a separate cell in Jupyter. When the cells are run together, I get a series of messages after a long run time, all along the lines of RuntimeWarning: Glyph 2332 missing from current font. font.set_text(s, 0.0, flags=flags)

warped Over a year ago

@cphill plt.tight_layout() is only needed to adjust the subplot spacing. The error you are getting seems to have something to do with your data. Can you post a representative sample of your data that reproduces this error?

cphill Over a year ago

Yes, I actually noticed a more complete error message Tight layout not applied. tight_layout cannot make axes height small enough to accommodate all axes decorations. The dataset is hosted on Kaggle at kaggle.com/ruchi798/…

warped Over a year ago

@cphill to be honest, I would rather not register at kaggle. Can you copy+paste a subset of the data?
