
I would like to plot the line so that it sits on top of each bar, but for some reason the line ends up inside the bars. Below is an overview of the dataset; the numbers are the moving averages.

I would like to present a data visualisation that combines a line chart and a bar chart. Could you tell me what mistake I have made? Could the moving average be causing this issue?

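library(dplyr)    # filter(), select(), mutate(), %>%
library(tidyr)    # gather()
library(zoo)      # rollmean()
library(ggplot2)

# Rolling means of the daily positive count for England, over several window sizes
rollmean_covid <- covid_ts1 %>%
    filter(Nation == "England") %>%
    select(date, daily_pos_num) %>%
    mutate(pos_num01 = rollmean(daily_pos_num, k = 3, fill = NA),
           pos_num02 = rollmean(daily_pos_num, k = 5, fill = NA),
           pos_num03 = rollmean(daily_pos_num, k = 7, fill = NA),
           pos_num04 = rollmean(daily_pos_num, k = 10, fill = NA),
           pos_num05 = rollmean(daily_pos_num, k = 14, fill = NA))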

rollmean_covid_metric <- rollmean_covid %>% gather(metric, number, pos_num01:pos_num05)
rollmean_covid_metric %>% filter(metric == "pos_num01") %>% 
  ggplot() + 
  geom_line(aes(date, number), col = "Blue") +
  geom_col(aes(date, number), fill = "orange", alpha = 0.7)

Edit 1: the dataset overview is provided below.

# A tibble: 10 x 7
   date       daily_pos_num pos_num01 pos_num02 pos_num03 pos_num04 pos_num05
   <date>             <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
 1 2020-01-30             2    NA          NA      NA          NA      NA    
 2 2020-01-30             0     0.667      NA      NA          NA      NA    
 3 2020-01-31             0     0           0.4    NA          NA      NA    
 4 2020-01-31             0     0           0       0.286      NA      NA    
 5 2020-02-01             0     0           0       0           0.3    NA    
 6 2020-02-01             0     0           0       0.143       0.1    NA    
 7 2020-02-02             0     0           0.2     0.143       0.1     0.286
 8 2020-02-02             0     0.333       0.2     0.143       0.2     0.143
 9 2020-02-03             1     0.333       0.2     0.143       0.2     0.143
10 2020-02-03             0     0.333       0.2     0.286       0.2     0.143
  • Hi James. You haven't told us what the problem is. Commented Jul 22, 2020 at 17:30
  • @AllanCameron I expected the line to sit on top of the bars, but I could not get the line to plot over the bars. Commented Jul 22, 2020 at 17:36

1 Answer


You have multiple observations for one date, and geom_col uses position = "stack" by default, so the values for each date are stacked on top of one another. That is why the bars are taller than the line. You probably want position = "identity"; this should work:

rollmean_covid_metric %>% 
    filter(metric == "pos_num01") %>% 
    ggplot(aes(date, number)) + 
    geom_line(color = "blue") +
    geom_col(position = "identity", fill = "orange", alpha = 0.7)
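
To see the stacking effect in isolation, here is a minimal sketch on made-up data (the `toy` tibble and its values are hypothetical, not taken from the question). It uses ggplot2's layer_data() to read back the bar heights that each position setting produces.

```r
library(ggplot2)
library(tibble)

# Hypothetical toy data: two rows for a single date, as in the question's tibble
toy <- tibble(
  date   = as.Date(c("2020-02-01", "2020-02-01")),
  number = c(1, 3)
)

# Default position = "stack": the two bars are piled on top of each other
p_stack <- ggplot(toy, aes(date, number)) + geom_col()

# position = "identity": each bar starts at zero and the bars overlap
p_identity <- ggplot(toy, aes(date, number)) + geom_col(position = "identity")

# Tallest bar top that each plot would draw
top_stack    <- max(layer_data(p_stack, 1)$ymax)
top_identity <- max(layer_data(p_identity, 1)$ymax)
```

With stacking, the top of the bar is the sum of the values for that date, while a line drawn through the same rows only reaches each individual value, which would explain why the line appears inside the bars.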

Edit to add: as @chemdork123 points out in the comments, if one date has multiple values this probably won't give desired results. In general, the best solution for these types of problems is to munge your data into the correct shape before piping it into a ggplot call.
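
As a rough sketch of what munging the data first could look like, the example below collapses duplicate dates to one row per date before plotting. The `toy` data is made up, and using mean() to combine duplicates is an assumption; sum() or dropping exact duplicates with distinct() may be more appropriate depending on what the repeated rows actually represent.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data with two rows per date (values are made up)
toy <- tibble(
  date   = as.Date(c("2020-02-01", "2020-02-01", "2020-02-02", "2020-02-02")),
  number = c(1, 3, 2, 4)
)

# Collapse to one row per date; mean() is an assumption about the intent
toy_daily <- toy %>%
  group_by(date) %>%
  summarise(number = mean(number, na.rm = TRUE), .groups = "drop")

# With one value per date, the default stacking no longer changes bar heights
p <- ggplot(toy_daily, aes(date, number)) +
  geom_col(fill = "orange", alpha = 0.7) +
  geom_line(colour = "blue")
```

Once each date has a single row, the top of every bar and the line pass through the same points without needing position = "identity".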


2 Comments

This answer +1. OP, your filter() command prior to plotting is giving you duplicate data. Check the data preparation prior to the ggplot call. Setting position="identity" will work to fix your plot, but you will still be plotting two copies of geom_col on top of one another. If they are duplicates... no huge deal, you'll just have darker bars. If they are not duplicates, your results will still look wrong. Without the original dataset, it's difficult to see where you're going wrong going from rollmean_covid to rollmean_covid_metric, but the answer lies within there.
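
One way to check for the duplication described above is to count rows per date and see whether the repeated rows carry the same value. This is only a sketch on made-up data; `toy`, `n_rows` and `n_values` are names introduced here for illustration.

```r
library(dplyr)

# Hypothetical toy data: one date with identical duplicates,
# one date with differing values, one date with a single row
toy <- tibble(
  date   = as.Date(c("2020-02-01", "2020-02-01",
                     "2020-02-02", "2020-02-02",
                     "2020-02-03")),
  number = c(1, 1, 2, 4, 5)
)

# Dates that appear more than once, and how many distinct values they hold
dup_summary <- toy %>%
  group_by(date) %>%
  summarise(n_rows = n(), n_values = n_distinct(number), .groups = "drop") %>%
  filter(n_rows > 1)
```

Rows where n_values is 1 are exact duplicates, which only darken the bars under position = "identity"; rows where n_values is greater than 1 hold conflicting values and need to be resolved in the data preparation step.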
Thanks for your help, @DiceboyT and @chemdork123. This not only explains my problem but also helps me better understand the logic of ggplot2.
