
I am plotting multiple vectors in one graph with ggplot2. I have the following data frame:

Array 1 2 3 4 5 6 7 8 9 10
Arr1 0.1 0.1 0.1 0.2 0.2 0.2 0.7 0.7 0.4 0.7
Arr2 0.6 0.6 0.6 0.1 0.1 0.1 0.1 0.1 0.5 0.1
Arr3 0.3 0.3 0.3 0.7 0.7 0.7 0.2 0.2 0.1 0.2
Arr4 0.4 0.6 0.7 0.2 0.1 0.3 0.4 0.5 0.3 0.9
B a a a b b b a a a b
C b b b a a b b a b a

So I melt the data to plot all vectors as horizontal stacked bars, as follows:

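library(tidyverse)  # dplyr, stringr, ggplot2
library(reshape2)   # melt()

df2 <- df %>%
  melt(id.vars = "Array") %>%
  # keep only the column number (1..10) from the melted column name
  mutate(variable = str_extract(variable, "[0-9]+")) %>%
  # recode the letters in rows B and C; everything else is parsed as a number
  mutate(value = case_when(
    value == "a" ~ 1,
    value == "b" ~ 2,
    TRUE ~ as.numeric(value)
  )) %>%
  mutate(variable = as.numeric(variable))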
  
df2 %>% 
  ggplot(aes(x = Array, y = variable, group = Array, fill = value)) +
  geom_col() + coord_flip()

But the x-axis is wrong: the image shows that runs with the same number of a and b in vector B have different sizes, and the last single element is larger than the first three. The problem on the x-axis is easier to see with vectors B and C than with the Arr vectors.

In df2, variable only takes the values 1 to 10, so I cannot figure out why there are more than 50 points on the x-axis.

Graph

  • I'm having trouble reproducing this. Please add the output of dput(df) to your question. Also, please show which packages you are using: library(tidyverse), library(reshape2)? Commented Oct 29, 2021 at 2:20
  • Yes, library(tidyverse) and library(reshape2) are used. Commented Oct 29, 2021 at 8:11

1 Answer

This happens because each bar stacks the y values (variable). For each Array category the bar stacks the values 1 to 10, so its total length is 1 + 2 + ... + 10 = 55; that is why the x-axis runs past 50. It is also why a and b have different sizes in B and C. The first blue blocks of B and C are (a, a, a) = 1 + 2 + 3 = 6 and (b, b, b) = 1 + 2 + 3 = 6, so they have the same size. The second blue blocks are (b, b, b) = 4 + 5 + 6 = 15 for B and (a, a) = 4 + 5 = 9 for C, so they have different sizes.
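
The arithmetic above can be checked without ggplot2. The following is a minimal base R sketch, not part of the original answer: the vectors B and C are copied from the question's data, and run_lengths() is a hypothetical helper introduced only for this illustration. It sums the positions covered by each run of identical letters, which is the length that block gets when the positions 1 to 10 are stacked. It ignores the order in which ggplot2 draws the segments.

```r
# Positions 1..10 are what the question maps to y, so geom_col() stacks them.
variable <- 1:10

# Rows B and C from the question's data frame.
B <- c("a", "a", "a", "b", "b", "b", "a", "a", "a", "b")
C <- c("b", "b", "b", "a", "a", "b", "b", "a", "b", "a")

# Total bar length for any Array: 1 + 2 + ... + 10.
total_length <- sum(variable)

# Hypothetical helper: length of each run of identical letters,
# computed as the sum of the positions that the run covers.
run_lengths <- function(x) {
  runs <- rle(x)
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  setNames(as.vector(tapply(variable, run_id, sum)), runs$values)
}

total_length    # 55
run_lengths(B)  # a = 6, b = 15, a = 24, b = 10
run_lengths(C)  # b = 6, a = 9,  b = 13, a = 8, b = 9, a = 10
```

Note that the single trailing b in B gets length 10, more than the first three a values combined (6), which matches the symptom described in the question.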

If you want the x-axis to range up to 10 and every a and b element in B and C to have the same size, set y (variable) to a constant 1:

df2<-df %>% 
  melt(id.vars = "Array") %>%
  mutate(value = case_when(
    value == "a" ~ 1,
    value == "b" ~ 2, 
    TRUE ~ as.numeric(value)
  )) %>%
  mutate(variable = 1) # change to 1
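
As a quick sanity check of this change, here is a small base R sketch under the same assumptions as the data in the question (B and C are copied from it; unit_height is a name introduced only for this illustration). With every y value equal to 1, a block's length is simply the number of elements in the run, so equal-length runs get equal sizes and each bar totals 10.

```r
# Rows B and C from the question's data frame.
B <- c("a", "a", "a", "b", "b", "b", "a", "a", "a", "b")
C <- c("b", "b", "b", "a", "a", "b", "b", "a", "b", "a")

# After mutate(variable = 1), every stacked segment has height 1.
unit_height <- rep(1, length(B))
bar_total <- sum(unit_height)

# Block sizes are now just the run lengths of identical letters.
runs_B <- rle(B)$lengths
runs_C <- rle(C)$lengths

bar_total  # 10
runs_B     # 3 3 3 1
runs_C     # 3 2 2 1 1 1
```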

Update (legend): you can use scale_fill_continuous() to customize the legend text. The code below produces the legend shown in the picture.

df2 %>%
  ggplot(aes(x = Array, y = variable, group = Array, fill = value)) +
  geom_col() + 
  scale_fill_continuous(breaks = c(0.1, 1, 2), labels = c("0.1", "1 (=a)", "2 (=b)"))+
  coord_flip()

3 Comments

Thanks @Xiang, could you help me show the legend for B and C as a and b rather than as numbers? It's fine if you split B and C into one graph and all the Arr rows into another, with two separate legends.
I have updated the answer to address the legend question.
Is it possible to get distinct colors instead of a blue gradient for each break, to differentiate them properly?
