Background

Let's say I have the following dataset:

import pandas as pd
import numpy as np

data = ([["Cheese", x] for x in np.random.normal(0.8, 0.03, 10)] + 
        [["Meat", x] for x in np.random.normal(0.4, 0.05, 14)] + 
        [["Bread", 0.8], ["Bread", 0.65]])

df = pd.DataFrame(data, columns=["Food", "Score"])


import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="ticks", color_codes=True)
sns.set_context("paper")
sns.catplot(x="Score", y="Food", kind="box", data=df)

which yields the following plot (or similar, depending on the generated random numbers):

(Figure: sample box plot)

I am using box plots for my actual data because, with the number of categories I want to show, individual dots make the plot far too noisy, whereas the boxes give a good overview of how the data is distributed, which is what I am after. The problem is categories like "Bread".

Question

As you can see, seaborn draws boxes with median, quartiles, etc. for all three categories. However, since the "Bread" category has only two data points, a box plot is not really an appropriate representation for it. I would much rather show this category as individual dots only.

But the examples in the seaborn categorical tutorial (https://seaborn.pydata.org/tutorial/categorical.html) only show how to combine box plots and simple dots by plotting both for all categories, which is not what I am after.

In short: How do I plot categorical data with seaborn, selecting the appropriate representation for each category?

1 Answer

One option is to split the DataFrame into a "Bread" subset and a subset with everything else:

dfb = df[df['Food'].notnull() & (df['Food'] == 'Bread')]
dfnot_b = df[df['Food'].notnull() & (df['Food'] != 'Bread')]

Then create a figure with a second y-axis that shares the same x-axis:

fig, ax = plt.subplots()
ax2 = ax.twinx()

Finally, draw a different plot type on each axis:

sns.boxplot(x="Score", y="Food", data=dfnot_b, ax=ax)
sns.stripplot(x="Score", y="Food", data=dfb, ax=ax2)

(Figure: plot overlay)

2 Comments

Nice idea with the second axis. Though now the "Bread" category isn't given any vertical space.
@NOhs In the sns.boxplot function you can set width=0.2 to make some space.
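
Regarding the vertical-space issue raised above, a possible alternative (not from the answer itself) is to skip the second axis and draw both plot types on the same axis, passing the same order list to sns.boxplot and sns.stripplot so that every category gets its own row. The sketch below assumes a cutoff of 5 points (MIN_POINTS, a name and value made up for illustration) to decide which categories get a box and which get dots, and it uses a seeded generator only so the data is repeatable:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Same shape of data as in the question, with a seeded generator for repeatability.
rng = np.random.default_rng(0)
data = ([["Cheese", x] for x in rng.normal(0.8, 0.03, 10)] +
        [["Meat", x] for x in rng.normal(0.4, 0.05, 14)] +
        [["Bread", 0.8], ["Bread", 0.65]])
df = pd.DataFrame(data, columns=["Food", "Score"])

# Assumed cutoff (hypothetical, not from the thread): categories with fewer
# points than this are drawn as dots instead of a box.
MIN_POINTS = 5

counts = df["Food"].value_counts()
order = list(df["Food"].unique())  # one shared row order for both plots
dense = [c for c in order if counts[c] >= MIN_POINTS]
sparse = [c for c in order if counts[c] < MIN_POINTS]

fig, ax = plt.subplots()
# Boxes only for the well-populated categories; `order` still reserves a row
# for the sparse ones.
sns.boxplot(x="Score", y="Food", data=df[df["Food"].isin(dense)],
            order=order, ax=ax)
# Individual dots only for the sparse categories, placed in their own rows.
sns.stripplot(x="Score", y="Food", data=df[df["Food"].isin(sparse)],
              order=order, ax=ax)
# plt.show()  # uncomment to display the figure
```

With the data above, "Cheese" and "Meat" should end up as boxes and "Bread" as two dots in its own row. The cutoff is a judgment call and would need adjusting for the actual data.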
