Given such a data frame:

dt       val
02-09     0.1
02-09     0.2
02-09     0.15
02-10     0.3
02-10     -0.1
...

I want to use a boxplot to show the median and spread of val for each dt:

 ggplot(data = df, aes(x = dt, y = val)) + geom_boxplot()

But this is what I got: [image]

It can be observed that there is just one box. When I tried outlier.colour = "red", all the points were red. Why? All the values are in the interval (-1, 1).

  • I see a box for each category. It's hard to see, but apparently you have many identical values (such as 0)? Commented Feb 25, 2016 at 15:43
  • @Roland really? But why are they so flat? Commented Feb 25, 2016 at 15:44
  • @Roland And they are apparently not correct. Commented Feb 25, 2016 at 15:45

1 Answer

This should explain the problem:

set.seed(42)
x <- rnorm(10)
x <- c(x, rep(0, 100)) #add 100 zero values
boxplot(x)

resulting plot

quantile(x, c(0.25, 0.5, 0.75))
#25% 50% 75% 
#  0   0   0

If you have many (almost) identical values, the quartiles are (almost) identical.
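
The same effect also accounts for every point being coloured as an outlier: when the interquartile range is zero, any value that differs from the quartiles falls outside the whiskers. A minimal sketch, reusing the simulated data from above and the base R helper boxplot.stats (the name s is only illustrative):

```r
set.seed(42)
x <- c(rnorm(10), rep(0, 100))  # same simulated data as above

s <- boxplot.stats(x)  # default whisker length: coef = 1.5 times the IQR
s$stats                # whisker ends, hinges and median all coincide at 0
IQR(x)                 # 0, so the 1.5 * IQR fences collapse onto the box
length(s$out)          # all 10 non-zero values are flagged as outliers
```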

4 Comments

Solved. I will accept your answer in 3 minutes. Thanks.
Hi, it still treats some of my points as outliers. Do you know how to cover all the points in the box? Thanks.
You need to have a look at how a boxplot is defined again. The box covers the quartiles and can't cover all your points.
Sorry, I mean the lines of the box (the whiskers), which should extend to the maximum and minimum values.
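
For the whisker question in the last comment, one possible approach in base R is sketched below. It assumes the simulated data from the answer rather than the asker's real data frame, and it uses the range argument of boxplot() and the coef argument of boxplot.stats(), both of which switch off outlier detection when set to zero.

```r
set.seed(42)
x <- c(rnorm(10), rep(0, 100))  # same simulated data as in the answer

# range = 0 extends the whiskers to the data extremes,
# so no point is drawn separately as an outlier
boxplot(x, range = 0)

# The same summary without plotting: coef = 0 disables outlier detection
s <- boxplot.stats(x, coef = 0)
s$stats        # first and last entries are min(x) and max(x)
length(s$out)  # 0
```

This is base R, not ggplot2. With ggplot2, one option is to compute the five summary values per group yourself and pass them to geom_boxplot(stat = "identity"). The box itself stays flat either way, because the quartiles are still identical.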