Plotly: How to handle uneven gaps between categories in a box plot?

Question

I am trying to generate a box plot using subsets of a larger data set. When I show the plot, there are strange gaps between the boxes. Is there a way to center each box over the correct label? Also, can I remove the redundant labels in the legend?

fig = go.Figure()
melted_data = melted_data.sort_values(['model', 'alpha'])
for model, alpha in zip(combos['model'].to_list(), combos['alpha'].to_list()):
    data = melted_data[(melted_data.model == model) & (melted_data.alpha == alpha)]
    fig.add_trace(go.Box(
            y= data['value'],
            x = data['model'],
            marker_color=colors[alpha],
            name = alpha,
            boxmean=True,
        ))
fig.update_layout(
    showlegend=True,
    boxmode='group', # group together boxes of the different traces for each value of x
    boxgap = .1)
fig.show()

UPDATE

Here is code to reproduce the issue:

import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly



colors = {'A':plotly.colors.qualitative.Plotly[0], 
          'B':plotly.colors.qualitative.Plotly[1], 
          'C':plotly.colors.qualitative.Plotly[2],
          'D':plotly.colors.qualitative.Plotly[3],
          'E':plotly.colors.qualitative.Plotly[4],}

models = ['modelA', 'modelA', 'modelA', 'modelA', 'modelA', 'modelB', 'modelB', 'modelC', 'modelC', 'modelB', ]
samples = ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'B', 'D', 'C']
score_cols = ['score_{}'.format(x) for x in range(10)]
scores = [(np.random.normal(mu, sd, 10).tolist()) for mu, sd in zip((np.random.normal(.90, .06, 10)), [.06]*10)]
data = dict(zip(score_cols, scores))
data['model'] = models
data['sample'] = samples
df = pd.DataFrame(data)
melted_data = pd.melt(df, id_vars =['model', 'sample'], value_vars=score_cols)

fig = go.Figure()
for model, sample in zip(models, samples):
    data = melted_data[(melted_data['model'] == model) & (melted_data['sample'] == sample)]
    fig.add_trace(go.Box(
            y= data['value'],
            x = data['model'],
            marker_color=colors[sample],
            name = sample,
            boxmean=True,
        ))
fig.update_layout(
    showlegend=True,
    boxmode='group', # group together boxes of the different traces for each value of x
    boxgap = .1)
fig.show()

Have you tried setting the xaxis type as category, by adding {'type': 'category'} to the xaxis layout?
– s3dev, Commented Sep 17, 2020 at 12:59
Please update the question to add a sample of the data, as this example is currently not reproducible. Thanks.
– s3dev, Commented Sep 17, 2020 at 13:24

vestland · Accepted Answer · 2020-10-27 21:03:51Z

1

I couldn't quite figure out why your go.Figure turns out the way it does. But if you reshape your data from wide to long and use px.box, you'll get shorter, cleaner code and arguably a much better visual result. We can go into more detail later; the complete snippet is below.

Complete code:

import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly
import plotly.express as px



colors = {'A':plotly.colors.qualitative.Plotly[0], 
          'B':plotly.colors.qualitative.Plotly[1], 
          'C':plotly.colors.qualitative.Plotly[2],
          'D':plotly.colors.qualitative.Plotly[3],
          'E':plotly.colors.qualitative.Plotly[4],}

models = ['modelA', 'modelA', 'modelA', 'modelA', 'modelA', 'modelB', 'modelB', 'modelC', 'modelC', 'modelB', ]
samples = ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'B', 'D', 'C']
score_cols = ['score_{}'.format(x) for x in range(10)]
scores = [(np.random.normal(mu, sd, 10).tolist()) for mu, sd in zip((np.random.normal(.90, .06, 10)), [.06]*10)]
data = dict(zip(score_cols, scores))
data['model'] = models
data['sample'] = samples

df = pd.DataFrame(data)

df_long = pd.wide_to_long(df, stubnames='score',
                          i=['model', 'sample'], j='type',
                          sep='_', suffix=r'\w+').reset_index()  # raw string avoids an invalid-escape warning
df_long

fig = px.box(df_long, x='model', y="score", color ='sample')
fig.show()

edited Oct 27, 2020 at 21:03

answered Sep 18, 2020 at 12:32

vestland


1 Comment

xygonyx Over a year ago

Thank you, this worked out fine, it's not perfect, but I think that it will suffice. Thanks again.
