I want to show the change in job numbers over a certain time period. Ideally, I'd like to use a ggplot2 geom_dotplot and then color the dots by the column they come from for that month. One idea I have not tried yet: do I need to reshape my data from wide to long format with tidyr in order to plot this?

Example data

Month       Finance       Tech        Construction     Manufacturing
Jan         14,000        6,800       11,000           17,500
Feb         11,500        8,400       9,480            15,000
Mar         15,250        4,200       7,200            12,400
Apr         12,000        6,400       10,300           8,500
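
For reproducibility, the table above could be entered as a data frame along these lines. This is only a sketch: it assumes the object is called dat (the name used in the code in this thread) and that the counts are stored as plain numbers without thousands separators.

```r
library(tibble)

# Example data from the table above, with counts stored as plain numerics
# (no thousands separators) so they can be used in arithmetic.
dat <- tribble(
  ~Month, ~Finance, ~Tech, ~Construction, ~Manufacturing,
  "Jan",  14000,    6800,  11000,         17500,
  "Feb",  11500,    8400,   9480,         15000,
  "Mar",  15250,    4200,   7200,         12400,
  "Apr",  12000,    6400,  10300,          8500
)
```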

My current R code attempt is below. I know that I need to color the dots by industry type, treated as a factor. Maybe the data has to be in long format to do so.

library(tidyverse)
g <- ggplot(dat, aes(x = Month)) +
  geom_dotplot(stackgroups = TRUE, binwidth = 1000, binpositions = "all") +
  theme_light()
g

Here's how the plot I'm trying to make could look. Ideally I'd like to bin the dots as one dot per 1000 in the column value. Is that possible?


Thank you for taking the time to help someone who is new to R and is studying in school. Much appreciated as always,

1 Answer


I could not get geom_dotplot to work; the y-axis always comes out wrong. Try something like this instead: first pivot to long format, then repeat each Month + category row once per 1,000 jobs. Note that the solution below rounds up:

library(dplyr)
library(tidyr)
library(ggplot2)

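# Reshape to long format, then repeat each Month/category row once per 1,000 jobs
test <- dat %>%
  pivot_longer(-Month, names_to = "category") %>%
  group_by(Month, category) %>%
  summarize(bins = ceiling(value / 1000), .groups = "drop") %>%
  uncount(bins)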

If you would prefer to round down to the nearest 1000, use floor() instead of ceiling().
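
As a quick illustration of the difference, here is the dot count for one cell of the example data (Tech in Jan, 6,800 jobs) under each rounding rule. The variable names are only for illustration.

```r
# Number of dots for Tech in January (6,800 jobs) at one dot per 1,000
value <- 6800
dots_up   <- ceiling(value / 1000)  # 7: a partial thousand still gets a dot
dots_down <- floor(value / 1000)    # 6: only complete thousands get a dot
```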

Then plot:

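# Keep the months in their original order rather than alphabetical order
test$Month <- factor(test$Month, levels = dat$Month)

# Each row is one dot worth 1,000 jobs; stacking gives one dot per unit of y
test %>%
  ggplot(aes(x = Month, y = 1, col = category)) +
  geom_point(position = position_stack()) +
  scale_y_continuous(labels = scales::number_format(scale = 1000))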


11 Comments

Thank you so much! Looks amazing, I will try right now. @StupidWolf you save the day for me once again.
Hi @JacksonWalker, I just realized one thing.. if it is 6,800, would you have 6 or 7 dots? Right now, the code above rounds everything down.. so 6,800 will give 6. I can edit my answer to round up
Yes @Stupidwolf please do edit it to round up, thank you!
@JacksonWalker done.. see above, for example Jan, Tech you get 7 balls
thank you @stupidwolf, when I run test = pivot_longer(data1,-Month,names_to="category") %>% group_by(Month,category) %>% summarize(bins=ceiling(value/ 1000)) %>% uncount(bins) I get an error saying Error in rep(seq_nrow(data), w) : invalid 'times' argument. Could this be because my Month column actually has data that reads "Mar 2020" with the years?
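
The thread does not confirm the cause of this error. One possibility, offered only as a guess, is that some count values are missing or were read in as text (for example "15,250" with a comma), which would leave uncount() without valid repeat counts; a Month label such as "Mar 2020" should not matter by itself. A minimal sketch under that assumption, using made-up data1 contents and readr::parse_number() to convert the text to numbers:

```r
library(dplyr)
library(tidyr)
library(readr)

# Hypothetical input: counts read in as text, with one missing value.
data1 <- tibble::tibble(
  Month   = c("Mar 2020", "Apr 2020"),
  Finance = c("15,250", "12,000"),
  Tech    = c("4,200", NA)
)

test <- data1 %>%
  pivot_longer(-Month, names_to = "category") %>%
  mutate(value = parse_number(value)) %>%   # "15,250" -> 15250
  filter(!is.na(value)) %>%                 # drop rows that have no count
  mutate(bins = ceiling(value / 1000)) %>%
  uncount(bins)
```

Checking str(data1) first would show whether the count columns are numeric or character.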