I have 3 dataframes whose column names and number of rows are exactly the same. I want to plot all the columns from all three dataframes as a grouped boxplot in one figure using seaborn or matplotlib, but I am having difficulty combining and formatting the data so that I can plot it as a grouped boxplot.

df=

          A         B         C         D  E  F  G         H         I  J
0  0.031810  0.000556  0.007798  0.000741  0  0  0  0.000180  0.002105  0
1  0.028687  0.000571  0.009356  0.000000  0  0  0  0.000183  0.001250  0
2  0.029635  0.001111  0.009121  0.000000  0  0  0  0.000194  0.001111  0
3  0.030579  0.002424  0.007672  0.000000  0  0  0  0.000194  0.001176  0
4  0.028544  0.002667  0.007973  0.000000  0  0  0  0.000179  0.001333  0
5  0.027286  0.003226  0.006881  0.000000  0  0  0  0.000196  0.001111  0
6  0.031597  0.003030  0.006695  0.000000  0  0  0  0.000180  0.002353  0
7  0.034226  0.003030  0.010804  0.000667  0  0  0  0.000179  0.003333  0
8  0.035105  0.002941  0.010176  0.000645  0  0  0  0.000364  0.003529  0
9  0.035171  0.003125  0.012666  0.001250  0  0  0  0.000612  0.005556  0 

df1 =

          A         B         C         D  E  F  G         H         I  J
0  0.034898  0.003750  0.014091  0.001290  0  0  0  0.001488  0.005333  0
1  0.042847  0.003243  0.011559  0.000625  0  0  0  0.002272  0.010769  0
2  0.046087  0.005455  0.013101  0.000588  0  0  0  0.002147  0.008750  0
3  0.042719  0.003684  0.010496  0.001333  0  0  0  0.002627  0.004444  0
4  0.042410  0.004211  0.011580  0.000645  0  0  0  0.003007  0.006250  0
5  0.044515  0.003500  0.013990  0.000000  0  0  0  0.003954  0.007000  0
6  0.046062  0.004865  0.013278  0.000714  0  0  0  0.004035  0.011111  0
7  0.043666  0.004444  0.013460  0.000625  0  0  0  0.003826  0.010000  0
8  0.039888  0.006857  0.014351  0.000690  0  0  0  0.004314  0.011474  0
9  0.048203  0.006667  0.016338  0.000741  0  0  0  0.005294  0.013603  0

df3 =

          A         B         C         D  E  F  G         H         I  J
0  0.048576  0.006471  0.020130  0.002667  0  0  0  0.005536  0.015179  0
1  0.056270  0.007179  0.021519  0.001429  0  0  0  0.005524  0.012333  0
2  0.054020  0.008235  0.024464  0.001538  0  0  0  0.005926  0.010445  0
3  0.047297  0.008649  0.026650  0.002198  0  0  0  0.005870  0.010000  0
4  0.049347  0.009412  0.022808  0.002838  0  0  0  0.006541  0.012222  0
5  0.052026  0.010000  0.019935  0.002714  0  0  0  0.005062  0.012222  0
6  0.055124  0.010625  0.022950  0.003499  0  0  0  0.005954  0.008964  0
7  0.044411  0.010909  0.019129  0.005709  0  0  0  0.005209  0.007222  0
8  0.047697  0.010270  0.017234  0.008800  0  0  0  0.004808  0.008355  0
9  0.048562  0.010857  0.020219  0.008504  0  0  0  0.005665  0.004862  0

I can do single boxplots by using the following:

import seaborn as sns
g = sns.boxplot(data=df, color='white', fliersize=1, linewidth=2, meanline=True, showmeans=True)

But getting all three into one figure is harder. I gather that I need to rearrange the data into a single combined dataframe and use hue, but how exactly should I format the data? Any help?

1 Answer

You can do it all in one sns.boxplot call by concatenating the dataframes and passing hue:

import pandas as pd
import seaborn as sns

tmp = (pd.concat([d.assign(data=i)                          # assign adds a column `data` holding i, the position of the source dataframe
                    for i, d in enumerate([df, df1, df3])]  # enumerate yields the pairs (0, df), (1, df1), (2, df3)
                )
         .melt(id_vars='data')                              # melt keeps `data` as an identifier column and stacks the
                                                            # other columns into `variable` (column name) and `value`
      )

sns.boxplot(data=tmp, x='variable', hue='data', y='value')

Output: a grouped boxplot (image not preserved).
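To see what the concat-and-melt step produces, here is a minimal sketch on tiny made-up frames. The values are illustrative only and are not the data from the question; only the names df, df1 and df3 are reused. With melt's defaults, the result should be a long table with the columns data, variable and value, one row per original cell.

```python
import pandas as pd

# Small stand-ins for df, df1 and df3 (made-up values, same column names in each)
df = pd.DataFrame({"A": [0.1, 0.2], "B": [0.3, 0.4]})
df1 = pd.DataFrame({"A": [0.5, 0.6], "B": [0.7, 0.8]})
df3 = pd.DataFrame({"A": [0.9, 1.0], "B": [1.1, 1.2]})

frames = [df, df1, df3]

# Tag each frame with its position in the list, stack them, then reshape to long format
tmp = pd.concat(
    [d.assign(data=i) for i, d in enumerate(frames)]
).melt(id_vars="data")

print(tmp.columns.tolist())  # column names of the long table
print(tmp.shape)             # rows = frames * rows per frame * value columns
print(tmp.head())
```

In this layout, x='variable' gives one group per original column and hue='data' gives one box per source dataframe within each group, which is why the single sns.boxplot call above is enough.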

2 Comments

Hi Quang, thanks for your answer, it works like a charm. Could you also add some details on your approach so that I can understand your code?
@HT121 see the update. You can print tmp to see its structure.
