
I have 4 dataframes:

df1 = pd.DataFrame({'ID': [0, 0, 0, 0, 0, 0],
                    'value': [3.0, 3.5, 4.5, NaN, 7.0, 8.1]})

df2 = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1],
                    'value': [9.4, NaN, 4.5, 2.4, 4.0, 3.9]})

df3 = pd.DataFrame({'ID': [2, 2, 2],
                    'value': [1.0, 3.9, 4.1]})

df4 = pd.DataFrame({'ID': [3, 3, 3, 3],
                    'value': [NaN, NaN, 5.8, 3.0]})

I want to make a boxplot of the value column from each of the dataframes. I did the following:

fig, ax2 = plt.subplots()
vec = [df1['value'].values,df2['value'].values,df3['value'].values,df4['value'].values]
labels = ['ID_0','ID_1', 'ID_2', 'ID_3']
ax2.boxplot(vec, labels = labels)
ax2.set_title('Values')
plt.show()

But it doesn't work and gives me an empty plot. Is there a better way to do this?

Traceback

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
      1 df1 = pd.DataFrame({'ID': [0, 0, 0, 0, 0, 0],
----> 2                     'value': [3.0, 3.5, 4.5, NaN, 7.0, 8.1]})
      4 df2 = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1],
      5                     'value': [9.4, NaN, 4.5, 2.4, 4.0, 3.9]})
      7 df3 = pd.DataFrame({'ID': [2, 2, 2],
      8                     'value': [1.0, 3.9, 4.1]})

NameError: name 'NaN' is not defined

1 Answer


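The NameError occurs because NaN is not a defined name in Python; use np.nan instead (add import numpy as np if you have not already). You also need to call dropna() before plotting, because boxplot draws an empty box for any array that contains NaN. Making the changes...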

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# np.nan replaces the undefined name NaN; dropna() removes those rows before plotting
df1 = pd.DataFrame({'ID': [0, 0, 0, 0, 0, 0], 'value': [3.0, 3.5, 4.5, np.nan, 7.0, 8.1]}).dropna()
df2 = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1], 'value': [9.4, np.nan, 4.5, 2.4, 4.0, 3.9]}).dropna()
df3 = pd.DataFrame({'ID': [2, 2, 2], 'value': [1.0, 3.9, 4.1]}).dropna()
df4 = pd.DataFrame({'ID': [3, 3, 3, 3], 'value': [np.nan, np.nan, 5.8, 3.0]}).dropna()
fig, ax2 = plt.subplots()
# One array of values per box
vec = [df1['value'].values, df2['value'].values, df3['value'].values, df4['value'].values]
labels = ['ID_0', 'ID_1', 'ID_2', 'ID_3']
# Note: Matplotlib 3.9+ prefers tick_labels= over labels=
ax2.boxplot(vec, labels=labels)
ax2.set_title('Values')
plt.show()

This gives you a plot with one box per dataframe, labelled ID_0 to ID_3.
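On the "is there a better way" part of the question, one possible alternative is to concatenate the dataframes and let groupby build the per-ID arrays, so the list and labels do not have to be written out by hand. This is only a sketch under some assumptions: the names frames, combined and groups are made up for illustration, and the tick labels are set with set_xticklabels rather than the labels argument so that it does not depend on the Matplotlib version.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'ID': [0, 0, 0, 0, 0, 0], 'value': [3.0, 3.5, 4.5, np.nan, 7.0, 8.1]})
df2 = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1], 'value': [9.4, np.nan, 4.5, 2.4, 4.0, 3.9]})
df3 = pd.DataFrame({'ID': [2, 2, 2], 'value': [1.0, 3.9, 4.1]})
df4 = pd.DataFrame({'ID': [3, 3, 3, 3], 'value': [np.nan, np.nan, 5.8, 3.0]})

# Stack everything into one frame and drop the rows whose value is NaN
frames = [df1, df2, df3, df4]  # hypothetical name
combined = pd.concat(frames, ignore_index=True).dropna(subset=['value'])

# Build one array per ID; the dict keys double as tick labels
groups = {f'ID_{key}': grp['value'].to_numpy() for key, grp in combined.groupby('ID')}

fig, ax = plt.subplots()
result = ax.boxplot(list(groups.values()))
ax.set_xticklabels(list(groups.keys()))
ax.set_title('Values')
```

Call plt.show() afterwards to display the figure. Adding a fifth dataframe then only means appending it to frames.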
