
I would like to create a dataframe by merging the dataframe ss with a vector of dates, daily_vector, through the column "ss$Date_R". I would like to keep all rows from daily_vector so that I can see which dates have no data in ss. I have tried the function merge; however, when I did, the vector appeared as a list of numbers and not as dates.

The column "ss$Date_R" is a character column because I built it by concatenating the year, month and day.

head(ss)
                         Station Variable Value     Date_R
    1    SAN VICENTE DEL PALACIO    TMAX1    90 1985-01-01
    910  SAN VICENTE DEL PALACIO    TMAX2    90 1985-01-02
    1819 SAN VICENTE DEL PALACIO    TMAX3   110 1985-01-03
    2728 SAN VICENTE DEL PALACIO    TMAX4    85 1985-01-04
    3637 SAN VICENTE DEL PALACIO    TMAX5   110 1985-01-05
    4546 SAN VICENTE DEL PALACIO    TMAX6   100 1985-01-06
str(ss)
'data.frame':   9418 obs. of  4 variables:
 $ Station : Factor w/ 3 levels "MEDINA DE RIOSECO",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Variable: Factor w/ 31 levels "TMAX1","TMAX2",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Value   : int  90 90 110 85 110 100 80 30 80 70 ...
 $ Date_R  : chr  "1985-01-01" "1985-01-02" "1985-01-03" "1985-01-04" ...


daily_vector <-as.factor(seq(as.Date("1985-01-01"), as.Date("2010-10-14"), by="days"))

Does someone know how I can merge these two kinds of information? Is there a better way to find out which days are absent from the dataframe ss?

Thanks in advance

  • I think you need a second dataframe, instead of a vector, with a column named Date_R, and then merge by Date_R. Commented May 25, 2015 at 15:16
  • I have tried to create a dataframe with two variables (the same vector repeated with different colnames); however, when I did, the data was converted to numeric (1,2,3...). Commented May 25, 2015 at 15:49
  • Can you, with dput, provide a portion of your data and daily_vector? Commented May 25, 2015 at 17:03

2 Answers


If you just want to check which dates in daily_vector are not in ss$Date_R, you don't need to add a new column. Instead, you can use

ss$Date_R <- as.Date(ss$Date_R)    
daily_vector <- seq(as.Date("1985-01-01"), as.Date("2010-10-14"), by="days")
missing <- !daily_vector %in% ss$Date_R 
daily_vector[missing]

This will return the dates missing in ss$Date_R as a simple vector of dates.
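As a quick way to try the check above without the full data, here is a minimal sketch on toy data. The names `all_days`, `ss_toy` and `missing_days` are made up for illustration, and the values are invented; only the `%in%` technique comes from the answer.

```r
# Toy data (hypothetical): six consecutive days, with 1985-01-05 left out
all_days <- seq(as.Date("1985-01-01"), as.Date("1985-01-06"), by = "day")
ss_toy <- data.frame(
  Value  = c(90, 90, 110, 85, 100),
  Date_R = c("1985-01-01", "1985-01-02", "1985-01-03",
             "1985-01-04", "1985-01-06"),
  stringsAsFactors = FALSE
)

# Convert the character column to Date so both sides have the same class
ss_toy$Date_R <- as.Date(ss_toy$Date_R)

# Keep the dates of the full sequence that never appear in the data
missing_days <- all_days[!all_days %in% ss_toy$Date_R]
print(missing_days)
```

Note that daily_vector is kept as a Date vector here rather than wrapped in as.factor, so the result prints as dates.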

Edit: To add the rows of missing dates to your dataframe, you can use merge as follows:

daily_ex <- daily_vector[1:6] # 6 total dates
ss <- data.frame(V1=rnorm(5), V2=rnorm(5),
            Date_R=c(daily_vector[c(1:4, 6)])) # 5 total rows, skipped date #5 on purpose
Date_R_all <- data.frame(Date_R = daily_ex)
merge(ss, Date_R_all, by="Date_R", all=TRUE)

The result is

      Date_R         V1         V2
1 1985-01-01 -0.2152378 -1.1546424
2 1985-01-02  0.7188043 -0.3882131
3 1985-01-03  0.9581949  1.2717832
4 1985-01-04 -0.6559881 -0.6670120
5 1985-01-05         NA         NA
6 1985-01-06 -0.6285255 -1.2645569

1 Comment

It works perfectly for checking which data is absent! But in the end I need to get the dataframe. Would it be possible to get the result as a dataframe with the values and the NAs? I would like to interpolate new data into the NA cells from the previous and next values. Thank you very much
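For the interpolation mentioned in this comment, one possible approach (not part of either answer) is base R's approx(), which interpolates linearly between the known points on each side of a gap. The sketch below assumes a merged dataframe like the one shown above; the name `merged`, the column `Value_filled` and the values are hypothetical.

```r
# Hypothetical merged result: one row per day, NA where ss had no record
merged <- data.frame(
  Date_R = seq(as.Date("1985-01-01"), as.Date("1985-01-06"), by = "day"),
  Value  = c(90, 90, 110, 85, NA, 100)
)

# Rows that actually carry a value
ok <- !is.na(merged$Value)

# Linear interpolation over time: dates are converted to day counts,
# the known points define the line segments, and xout asks for every day
merged$Value_filled <- approx(
  x    = as.numeric(merged$Date_R[ok]),
  y    = merged$Value[ok],
  xout = as.numeric(merged$Date_R)
)$y

print(merged)
```

With the default settings, approx() leaves NA for dates before the first or after the last known value, so gaps at the ends of the series would still need separate handling.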

I think the merge approach is fine, but first: (a) you need to set the class of your Date_R column to "Date"; (b) your daily_vector must be a data.frame (see ?merge for further information). Try the following:

ss$Date_R <- as.Date(ss$Date_R)
daily <- data.frame(seq(as.Date("1985-01-01"), as.Date("2010-10-14"), by="days"))
colnames(daily) <- "Date_R"   # same column name as in ss, so merge can match on it
merge(ss, daily, by="Date_R", all=TRUE)
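A self-contained sketch of the same idea on toy data, in case it helps to see the whole flow in one place. The names `ss_toy`, `daily_toy` and `merged` are made up for illustration, and the values are invented. It uses all.x = TRUE with the full date sequence on the left, which keeps one row per calendar day.

```r
# Toy stand-in for ss (hypothetical values), with 1985-01-05 missing
ss_toy <- data.frame(
  Station = "SAN VICENTE DEL PALACIO",
  Value   = c(90, 90, 110, 85, 100),
  Date_R  = as.Date(c("1985-01-01", "1985-01-02", "1985-01-03",
                      "1985-01-04", "1985-01-06")),
  stringsAsFactors = FALSE
)

# Full calendar as a one-column data.frame of class Date (not a factor)
daily_toy <- data.frame(
  Date_R = seq(as.Date("1985-01-01"), as.Date("1985-01-06"), by = "day")
)

# Left join: every day is kept, and days without data get NA
merged <- merge(daily_toy, ss_toy, by = "Date_R", all.x = TRUE)
print(merged)

# Rows for the days that had no record in ss_toy
print(merged[is.na(merged$Value), ])
```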
