Let's assume I have a dataframe with one column, Date, that runs from 2000 to 2019. The problem is that the monthly frequency is not complete (I should have 245 observations, but I only have 215). My aim is to detect which months are missing from the column.

Let's take this example. This is a sample dataframe:

df <- data.frame(Date = c("2015-01-22", "2015-03-05", "2015-04-15", "2015-06-03", "2015-07-16", "2015-09-03", "2015-10-22", "2015-12-03", "2016-01-21", "2016-03-10", "2016-04-21", "2016-06-02", "2016-07-21", "2016-09-08", "2016-10-20", "2016-12-08", "2017-01-19", "2017-03-09", "2017-04-27", "2017-06-08", "2017-07-20", "2017-09-07", "2017-10-26", "2017-12-14", "2018-01-25", "2018-03-08", "2018-04-26", "2018-06-14", "2018-07-26", "2018-09-13", "2018-10-25", "2018-12-13", "2019-01-24", "2019-03-07", "2019-04-10", "2019-06-06", "2019-07-25", "2019-09-12", "2019-10-24", "2019-12-12"))

df

I would like code that tells me which months are missing from my column of dates.

Can anyone help me?

Thanks a lot

2 Answers

Here are two types of results to see the missing months, with base R:

  • If you want the months that never appear in any year, you can try the following code:
missingMonths <- month.name[setdiff(seq(12),as.numeric(format(as.Date(df$Date),"%m")))]

such that

> missingMonths
[1] "February" "May"      "August"   "November"
  • If you want to check the missing months by year, you can try the code below:
missingMonths <- lapply(split(df,format(as.Date(df$Date),"%Y")), 
                        function(x) month.name[setdiff(seq(12),as.numeric(format(as.Date(x$Date),"%m")))])

such that

> missingMonths
$`2015`
[1] "February" "May"      "August"   "November"

$`2016`
[1] "February" "May"      "August"   "November"

$`2017`
[1] "February" "May"      "August"   "November"

$`2018`
[1] "February" "May"      "August"   "November"

$`2019`
[1] "February" "May"      "August"   "November"
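
One caveat with the per-year version: `split()` only creates groups for years that actually occur in the data, so a year with no rows at all would not show up in the list. A minimal sketch of one way around that, assuming it is acceptable to use the observed first and last year as the range, is to split on a factor whose levels cover every year. The `dates` vector and the names `years`, `present` and `missing_by_year` are illustrative, not part of the original question.

```r
# Illustrative dates only: 2016 has no observations at all
dates <- as.Date(c("2015-01-22", "2015-03-05", "2017-04-27"))

# Every year between the first and last observed year
obs_years <- as.integer(format(dates, "%Y"))
years <- seq(min(obs_years), max(obs_years))

# Splitting on a factor with explicit levels keeps empty years as empty groups
present <- split(as.integer(format(dates, "%m")),
                 factor(format(dates, "%Y"), levels = years))

# Months 1..12 that are absent in each year
missing_by_year <- lapply(present, function(m) month.name[setdiff(1:12, m)])
missing_by_year
```

Here `missing_by_year` has an entry for 2016 listing all twelve months, which the plain `split(df, ...)` approach would leave out.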

Not as succinct as above, but still does the trick in a couple of steps:

# Collapse each observed date to the first day of its month
month_date_strings <- unique(paste0(sub("-[^-]+$", "", 
                           as.character(df$Date)), "-01"))

# Every month start in the expected 2000-2019 range
month_seq_strings <- as.character(seq.Date(as.Date("2000-01-01"),
                      as.Date("2019-12-31"), by = "month"))

# Expected months that have no observation
month_seq_strings[!(month_seq_strings %in% month_date_strings)]
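
If the expected range should follow the data instead of being hard-coded, the same comparison can be done on "year-month" keys. This is only a sketch under the assumption that the first and last observed dates mark the ends of the range; the `dates` vector and the names `expected`, `observed` and `missing_months` are made up for illustration.

```r
# Illustrative subset of dates; February and May 2015 are absent
dates <- as.Date(c("2015-01-22", "2015-03-05", "2015-04-15", "2015-06-03"))

# First day of the earliest and latest observed months
first_month <- as.Date(format(min(dates), "%Y-%m-01"))
last_month  <- as.Date(format(max(dates), "%Y-%m-01"))

# All year-month keys in the range, and those actually observed
expected <- format(seq(first_month, last_month, by = "month"), "%Y-%m")
observed <- unique(format(dates, "%Y-%m"))

# Year-month keys with no observation
missing_months <- setdiff(expected, observed)
missing_months
# [1] "2015-02" "2015-05"
```

Anchoring both ends to the first of the month avoids the day-of-month drift that `seq(..., by = "month")` can produce when it starts late in a month.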

1 Comment

Very nice as well. Thanks
