Sorry if this is too basic.

I am trying to use ggplot2 to draw a boxplot for my data. However, I am having trouble melting the data into the correct format.

Here is what my data looks like:

head(Final_RMSE_MC)

  N_reg N_var RMSE_MC_1 RMSE_MC_2 RMSE_MC_3 RMSE_MC_4 RMSE_MC_5 RMSE_MC_6 RMSE_MC_7 RMSE_MC_8 RMSE_MC_9 RMSE_MC_10 RMSE_MC_11 RMSE_MC_12 RMSE_MC_13
1     6     5 0.5016800 0.5898132 0.5482860 0.4585713 0.4830320 0.4376286 0.4626646 0.5753290 0.4600453  0.4625784  0.4135086  0.5356082  0.4262005
2    10     5 0.4928764 0.4426350 0.4634775 0.4049509 0.5192989 0.4420706 0.3912822 0.4808609 0.4190173  0.4828170  0.4123871  0.4394507  0.4100748
3     4     4 0.4890946 0.4503532 0.5930480 0.5608510 0.4232696 0.5392966 0.4134308 0.5950408 0.5425955  0.5209573  0.6669176  0.4819051  0.4926042
4     5     4 0.5229090 0.5076377 0.5254299 0.4455789 0.4816532 0.5468765 0.4474718 0.4467224 0.4280381  0.6339686  0.3921858  0.5335065  0.4548194
5     9     4 0.4138625 0.4782089 0.4522069 0.4534526 0.4175361 0.4685324 0.3908619 0.4877251 0.4509520  0.4410600  0.4685804  0.4660575  0.4775753
6     3     3 0.5135749 0.6280533 0.5841148 0.5051640 0.5279784 0.5981735 0.4638461 0.4664253 0.4568787  0.4150206  0.5780827  0.5474891  0.4232878

I would like to melt the data based on N_var for all the columns except N_reg and N_var. So I tried

dfm <- melt(Final_RMSE_MC, id.vars = "N_reg")

and I got

  N_reg variable value
1     6    N_var     5
2    10    N_var     5
3     4    N_var     4
4     5    N_var     4
5     9    N_var     4
6     3    N_var     3

which does not seem right.

My next step is to use

ggplot(dfm, aes(x = variable, y = value)) + geom_boxplot()

Thanks for any suggestions!

1 Answer

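You should use both N_reg and N_var as id.vars, because they identify the row and therefore apply to every RMSE_MC value in it. With only N_reg as the id, N_var is melted as a measured variable too, which is why it shows up in the variable column.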

dfm <- melt(Final_RMSE_MC, id.vars = c("N_reg","N_var"))

head(dfm)
  N_reg N_var  variable     value
1     6     5 RMSE_MC_1 0.5016800
2    10     5 RMSE_MC_1 0.4928764
3     4     4 RMSE_MC_1 0.4890946
4     5     4 RMSE_MC_1 0.5229090
5     9     4 RMSE_MC_1 0.4138625
6     3     3 RMSE_MC_1 0.5135749

ggplot(dfm, aes(x = variable, y = value)) + geom_boxplot()
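
To see why the first attempt looked wrong, here is a minimal self-contained sketch. It assumes melt() comes from the reshape2 package (the question does not say which package is loaded) and uses a small stand-in data frame, toy, built from the first three rows and first three RMSE columns shown in the question.

```r
library(reshape2)  # assumption: melt() is reshape2::melt

# Stand-in for Final_RMSE_MC: first three rows, first three RMSE columns
toy <- data.frame(
  N_reg     = c(6, 10, 4),
  N_var     = c(5, 5, 4),
  RMSE_MC_1 = c(0.5016800, 0.4928764, 0.4890946),
  RMSE_MC_2 = c(0.5898132, 0.4426350, 0.4503532),
  RMSE_MC_3 = c(0.5482860, 0.4634775, 0.5930480)
)

# Only N_reg as id: N_var is stacked along with the RMSE columns and
# comes first, so head() shows nothing but N_var rows.
dfm_one <- melt(toy, id.vars = "N_reg")
levels(dfm_one$variable)

# Both as ids: only the RMSE_MC_* columns are stacked.
dfm <- melt(toy, id.vars = c("N_reg", "N_var"))
levels(dfm$variable)
nrow(dfm)  # 3 rows x 3 RMSE columns
```

So the original melt was not broken; head() simply showed the N_var block that sits at the top of the stacked result.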

4 Comments

Thanks. But actually N_reg and N_var are different
I meant that they are the same within one row.
I see. Should I use ggplot(dfm, aes(x = N_reg, y = value)) + geom_boxplot() if I want to build a boxplot for values with the same N_reg?
It needs to be factor(N_reg). I found it. Thanks so much!
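
The factor(N_reg) fix from the comments can be sketched as follows. This again assumes reshape2 and ggplot2 and reuses a small stand-in data frame taken from the question's first rows; the axis labels are illustrative only.

```r
library(reshape2)  # assumption: melt() is reshape2::melt
library(ggplot2)

# Stand-in for Final_RMSE_MC: first three rows, first three RMSE columns
toy <- data.frame(
  N_reg     = c(6, 10, 4),
  N_var     = c(5, 5, 4),
  RMSE_MC_1 = c(0.5016800, 0.4928764, 0.4890946),
  RMSE_MC_2 = c(0.5898132, 0.4426350, 0.4503532),
  RMSE_MC_3 = c(0.5482860, 0.4634775, 0.5930480)
)
dfm <- melt(toy, id.vars = c("N_reg", "N_var"))

# N_reg is numeric, so it is wrapped in factor() to make the x axis
# discrete: one box per distinct N_reg value.
p <- ggplot(dfm, aes(x = factor(N_reg), y = value)) +
  geom_boxplot() +
  labs(x = "N_reg", y = "RMSE")  # labels are illustrative
print(p)
```

With a numeric x and no grouping, geom_boxplot() typically draws a single box and warns about a continuous x aesthetic, which is why the conversion to a factor is needed.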
