Seaborn kdeplot not plotting some data?

Question

I'm trying to get the Seaborn kdeplot example to work on my dataset. For some reason, one of my datasets isn't plotting at all, but the other seems to be plotting fine. To get a minimal working example, I have sampled only 10 rows from my very large data sets.

My input data looks like this:

# Dataframe dfA
    index   x       y     category
0   595700  5   1.000000    14.0
1   293559  4   1.000000    14.0
2   562295  3   0.000000    14.0
3   219426  4   1.000000    14.0
4   592731  2   1.000000    14.0
5   178573  3   1.000000    14.0
6   553156  4   0.500000    14.0
7   385031  1   1.000000    14.0
8   391681  3   0.999998    14.0
9   492771  2   1.000000    14.0

# Dataframe dfB
    index   x      y      category
0   56345   3   1.000000    6.0
1   383741  4   1.000000    6.0
2   103044  2   1.000000    6.0
3   297357  5   1.000000    6.0
4   257508  3   1.000000    6.0
5   223600  2   0.999938    6.0
6   44530   2   1.000000    6.0
7   82925   3   1.000000    6.0
8   169592  3   0.500000    6.0
9   229482  4   0.285714    6.0

My code snippet looks like this:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="darkgrid")

# Set up the figure
f, ax = plt.subplots(figsize=(8, 8))

# Draw the two density plots
ax = sns.kdeplot(dfA.x, dfA.y,
             cmap="Reds", shade=True, shade_lowest=False)
ax = sns.kdeplot(dfB.x, dfB.y,
             cmap="Blues", shade=True, shade_lowest=False)

Why isn't the data from dataframe dfA actually plotting?

Are you creating only one axes object and plotting both into it (or even plotting at the figure level without specifying an axes)? What about f, axarr = plt.subplots(2), then sns.kdeplot(dfA.x, dfA.y, cmap="Reds", shade=True, shade_lowest=False, ax=axarr[0]) and sns.kdeplot(dfB.x, dfB.y, cmap="Blues", shade=True, shade_lowest=False, ax=axarr[1])?
– sascha, Commented Aug 24, 2016 at 1:56
I'm trying to plot both on the same axes. But dfA doesn't plot even if I comment out the second plot command.
– Joe, Commented Aug 24, 2016 at 4:57

mwaskom · Accepted Answer · 2016-08-24 14:27:48Z

3

I don't think Gaussian KDE is a good fit for either of your datasets. One variable takes only discrete values, and in the other the large majority of values appear to be a single constant. Data like this is not well modeled by a bivariate Gaussian distribution.
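A quick way to check both properties on a frame is sketched below. It rebuilds dfA from the ten sampled rows in the question (the index and category columns are left out), so the numbers describe only that sample, not the full dataset. The variable names distinct_x and top_y_share are made up for this illustration.

```python
import pandas as pd

# The ten sampled rows of dfA from the question (index and category omitted).
dfA = pd.DataFrame({
    "x": [5, 4, 3, 4, 2, 3, 4, 1, 3, 2],
    "y": [1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.5, 1.0, 0.999998, 1.0],
})

# Distinct values of x: a short list of integers indicates a discrete variable.
distinct_x = sorted(int(v) for v in dfA["x"].unique())

# Share of rows where y equals its single most common value.
# value_counts sorts by frequency, so the first entry is the most common value.
top_y_share = float(dfA["y"].value_counts(normalize=True).iloc[0])

print("distinct x values:", distinct_x)
print("share of rows at the most common y:", top_y_share)
```

If x has only a few distinct values and most of y sits on one value, a smooth bivariate density is likely to be a poor description of the data.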

As for what exactly is happening, I can't say for sure without the full dataset, but I expect that the KDE bandwidth (particularly on the y axis) ends up so narrow that the regions with non-negligible density are tiny. You could try setting a wider bandwidth, but my advice would be to use a different kind of plot for this data.

