I want to make a time series plot for weekdays only (i.e. excluding weekends and holidays). If I simply use ggplot with the date on the x-axis and y on the y-axis, the distance between a Monday and a Tuesday will not be the same as the distance between a Friday and the following Monday. Below is a daily data set with a date column.

df <- structure(list(PROCEDURE_DATO_DATO = structure(c(17533, 17534, 17535, 17536, 17539, 
                                                       17540, 17541, 17542, 17543, 17546, 
                                                       17547, 17548, 17549, 17550, 17553, 
                                                       17554, 17555, 17556, 17557, 17560), 
                                                     class = "Date"), 
                     Antal_akutte = c(17, 31, 22, 18, 25,
                                      26, 20, 20, 21, 19, 
                                      25, 26, 27, 14, 14, 
                                      39, 21, 23, 20, 13), 
                     Antal_besog = c(42L, 60L, 58L, 58L, 56L, 
                                     61L, 44L, 48L, 47L, 44L, 
                                     58L,60L, 58L, 45L, 38L, 
                                     73L, 49L, 50L, 53L, 40L), 
                     Andel = c(0.404761904761905, 0.516666666666667, 0.379310344827586, 
                               0.310344827586207, 0.446428571428571, 0.426229508196721, 
                               0.454545454545455, 0.416666666666667, 0.446808510638298, 
                               0.431818181818182, 0.431034482758621, 0.433333333333333, 
                               0.46551724137931, 0.311111111111111, 0.368421052631579, 
                               0.534246575342466, 0.428571428571429, 0.46, 0.377358490566038, 0.325)), 
                .Names = c("PROCEDURE_DATO_DATO", "Antal_akutte", "Antal_besog", "Andel"), 
                row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))

If I simply use a row number instead, I lose the dates on the axis. How can I use the row number for positioning but label the axis with the date column?

df %>% 
  mutate(row = row_number()) %>% 
  ggplot(aes(row, Antal_akutte)) +
  geom_line()

If I try to set the labels with scale_x_continuous, I get an error:

df %>% 
  mutate(row = row_number(), 
         PROCEDURE_DATO_DATO = as.character(PROCEDURE_DATO_DATO)) %>%
  ggplot(aes(row, Antal_akutte)) +
    geom_line() +
    scale_x_continuous(labels = seq.Date(as.Date("2018-01-02"), as.Date("2018-12-31"), by = "q"))

Error in f(..., self = self) : Breaks and labels are different lengths

  • No. A time series is continuous: every day is included. You could instead create a column of observation numbers or some other way to assign numbers continuously for just weekdays, then set labels however you want. It's unclear how you're trying to label this or build the plot, or what type of plot you want, since none of that code is included here. Commented Jan 28, 2019 at 16:06
  • @camilie: Thanks, at first I also simply made a row number and used it for my x-axis, but then I did not have the labels. Commented Jan 28, 2019 at 16:11
  • You can set labels, such as with scale_x_continuous. But again, without the code you're using, it's hard to help specifically. Commented Jan 28, 2019 at 16:14

1 Answer

You can convert your data to an eXtensible Time Series (xts) object, which makes working with time series pretty easy. Then use autoplot to plot the xts object with ggplot2:

# load libraries
library(ggplot2)
library(xts)

# create an xts object: a matrix of observations ordered by an index of dates
# (in your case `df$PROCEDURE_DATO_DATO`)
df_xts <- xts(df[, -1], order.by = df$PROCEDURE_DATO_DATO)

# make the plot
autoplot(df_xts, geom="line") 

Let's plot a few observations that span the first weekend in January:

> df_xts[3:6,]
           Antal_akutte Antal_besog     Andel
2018-01-04           22          58 0.3793103
2018-01-05           18          58 0.3103448
2018-01-08           25          56 0.4464286
2018-01-09           26          61 0.4262295

I'll use geom = "point" to clearly indicate the missing data points during the weekend.

autoplot(df_xts[3:6, ], geom = "point")

Update: To plot without the weekend dates, your solution should work:

df <- df[3:6, ] %>%
  mutate(row = row_number(),
         PROCEDURE_DATO_DATO = as.character(PROCEDURE_DATO_DATO))

ggplot(df, aes(row, Antal_akutte)) +
  geom_line() +
  scale_x_continuous(breaks = df$row, labels = df$PROCEDURE_DATO_DATO)
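
With more than a handful of rows, labelling every observation tends to crowd the axis. One possible variation, sketched here under assumptions, is to pass only a subset of row numbers as breaks and the matching dates as labels, so both vectors always have the same length (which is what the "Breaks and labels are different lengths" error is about). The names `plot_df`, `label_rows`, `date` and `value` are made up for this illustration; the values are the first eight rows of the question's data.

```r
library(dplyr)
library(ggplot2)

# Illustrative weekday-only data (first eight rows of the question's data set);
# the column names `date` and `value` are hypothetical.
plot_df <- tibble(
  date  = as.Date(c("2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05",
                    "2018-01-08", "2018-01-09", "2018-01-10", "2018-01-11")),
  value = c(17, 31, 22, 18, 25, 26, 20, 20)
) %>%
  mutate(row = row_number())

# Keep every second observation for labelling; breaks and labels are taken
# from the same rows, so their lengths always match.
label_rows <- plot_df %>% filter(row %% 2 == 1)

p <- ggplot(plot_df, aes(row, value)) +
  geom_line() +
  scale_x_continuous(breaks = label_rows$row,
                     labels = format(label_rows$date, "%Y-%m-%d")) +
  labs(x = NULL)

p
```

The x position is still the row number, so Friday and the following Monday sit one unit apart; only the tick text comes from the date column.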

6 Comments

@Lumina: Thanks, I tried to make it work with xts before, but I could not make pretty plots with plot() from base R. However, I don't want gaps in the time axis; I would like "one" day between Jan 5 and Jan 8. How do people plot daily stock prices?
@Lumina: Well, I guess the correct way is for the space between Friday and Monday to reflect the actual time span, rather than being the same as the space between, e.g., Monday and Tuesday. But doesn't this contradict the idea that a time series is equally spaced? Or would you say that my time series is equally spaced but with missing values for the weekends?
@Luminta: I don't understand why I should use an xts object. Why not just use df[3:6, ] %>% ggplot(aes(x=PROCEDURE_DATO_DATO)) + geom_line(aes(y=Antal_besog), color="blue")?
@David: I find it cleaner to work with xts/zoo objects when time series are involved, particularly because of how easy it is to manipulate the time index (for instance, if you need to pre-process your raw data to remove the weekends, .indexwday would be my first choice for that). However, a time series is continuous (it includes all days). Check out this answer for a way to remove the weekends with ggplot2 (but it uses facets and this may not be what you want): stackoverflow.com/questions/14136703/…
@Luminita: Thanks, I have heard about PerformanceAnalytics but have not looked into it yet. The way I filtered weekends was to create a wday column with lubridate and then apply a basic dplyr filter. Anyway, thanks. My problem was that I thought it was "wrong" for my plot to have extra space for the weekends. My reasoning was that when one sees a daily plot of stock prices, one does not see a gap for the weekend; that is, the space between Friday and Monday seems to be the same as the space between Monday and Tuesday. Maybe I am overthinking the issue... Thanks again.
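
The weekend filter described in the last comment (a lubridate weekday plus a dplyr filter) could look roughly like the sketch below. The data frame `daily` and its columns are invented for illustration and are not the question's data. Note that this only drops Saturdays and Sundays; holidays would need a separate list of dates.

```r
library(dplyr)
library(lubridate)

# Hypothetical daily data covering two full calendar weeks
# (2018-01-01 is a Monday).
daily <- tibble(
  date  = seq(as.Date("2018-01-01"), as.Date("2018-01-14"), by = "day"),
  value = seq_along(date)
)

weekdays_only <- daily %>%
  # week_start = 1 numbers Monday as 1 and Sunday as 7,
  # so values 1 to 5 are Monday to Friday
  filter(wday(date, week_start = 1) <= 5)

weekdays_only
```

Setting `week_start` explicitly avoids depending on the default, where Sunday is day 1.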