
I have two data frames. One is like below:

D_Time                 Speed_BT    Speed_GT
2016-09-12 00:15:00          23          60
2016-09-12 00:45:00          13          48
2016-09-12 01:30:00          13          25

The other one is like this:

D_Time                 Speed_AA    Speed_DD
2016-09-12 00:30:00          29          17
2016-09-12 01:00:00          46          59
2016-09-12 01:30:00          36          51

I want to merge the two data frames on D_Time, so that the result looks like the following table:

D_Time                   Speed_BT       Speed_GT    Speed_AA   Speed_DD
2016-09-12 00:15:00          23            60          NA         NA
2016-09-12 00:30:00          NA            NA          29         17
2016-09-12 00:45:00          13            48          NA         NA
2016-09-12 01:00:00          NA            NA          46         59
2016-09-12 01:15:00          NA            NA          NA         NA
2016-09-12 01:30:00          13            25          36         51 

It would be great if the result also included the 5th row (01:15:00, which appears in neither data frame), as in the table above. If that is not possible, the result without it is fine too.

I already have tried using this command:

add <- merge(df1, df2,by = "D_Time", all=TRUE)

But it does not merge properly: the Speed_AA and Speed_DD values end up in separate rows wherever the times differ.

D_Time class is "POSIXct" "POSIXt".

Can anyone help me out?

Thanks in advance.

3 Answers

2

You first need to create the full sequence of 15-minute timestamps and then merge it with both data frames, i.e.

ind <- c(df1$D_Time, df2$D_Time)

df4 <- data.frame(D_Time = seq.POSIXt(min(ind), max(ind), by = '15 mins'),
                  stringsAsFactors = FALSE)

Reduce(function(...)merge(..., all = TRUE), list(df1, df2, df4))

Which gives,

            D_Time Speed_BT Speed_GT Speed_AA Speed_DD
1 2016-09-12 00:15:00       23       60       NA       NA
2 2016-09-12 00:30:00       NA       NA       29       17
3 2016-09-12 00:45:00       13       48       NA       NA
4 2016-09-12 01:00:00       NA       NA       46       59
5 2016-09-12 01:15:00       NA       NA       NA       NA
6 2016-09-12 01:30:00       13       25       36       51
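
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea in base R. The way df1 and df2 are built here, the "UTC" time zone and the names grid and merged are my own assumptions for illustration, not part of the question; the merge step itself follows the approach above.

```r
# Rebuild the question's data with D_Time as POSIXct.
# Assumption: all timestamps are in UTC.
tz <- "UTC"
df1 <- data.frame(
  D_Time   = as.POSIXct(c("2016-09-12 00:15:00", "2016-09-12 00:45:00",
                          "2016-09-12 01:30:00"), tz = tz),
  Speed_BT = c(23, 13, 13),
  Speed_GT = c(60, 48, 25)
)
df2 <- data.frame(
  D_Time   = as.POSIXct(c("2016-09-12 00:30:00", "2016-09-12 01:00:00",
                          "2016-09-12 01:30:00"), tz = tz),
  Speed_AA = c(29, 46, 36),
  Speed_DD = c(17, 59, 51)
)

# Full 15-minute grid spanning both data frames.
ind  <- c(df1$D_Time, df2$D_Time)
grid <- data.frame(D_Time = seq(min(ind), max(ind), by = "15 min"))

# Full outer join of all three on D_Time; merge() sorts by the key.
merged <- Reduce(function(x, y) merge(x, y, by = "D_Time", all = TRUE),
                 list(df1, df2, grid))
print(merged)
```

Because the grid contains 01:15:00 and neither data frame does, that row comes out with NA in all four speed columns.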

0

Except for the 5th row, the desired output can be achieved by the following:

df <- read.table(text="D_Time,Speed_BT,Speed_GT
2016-09-12 00:15:00, 23,  60  
2016-09-12 00:45:00, 13,  48  
2016-09-12 01:30:00, 13,  25", header=TRUE, sep=",")

df2 <- read.table(text="D_Time, Speed_AA,        Speed_DD
2016-09-12 00:30:00,          29,            17  
2016-09-12 01:00:00,          46,            59  
2016-09-12 01:30:00,          36,            51
", header=TRUE, sep=",")

merge(df, df2, all=TRUE)

If you want to include the fifth row, it has to be present in one of the data frames, either df or df2. If you initialize df as shown below and then call merge(df, df2, all=TRUE), you will get the fifth row as well.

df <- read.table(text="D_Time,Speed_BT,Speed_GT
2016-09-12 00:15:00, 23,  60  
2016-09-12 00:45:00, 13,  48  
2016-09-12 01:30:00, 13,  25
2016-09-12 01:15:00, NA, NA", header=TRUE, sep=",")
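
One thing to keep in mind: read.table() reads D_Time as text, whereas the question states that D_Time is POSIXct. A small sketch of converting the column after reading, assuming the timestamps are in UTC (the time zone is my assumption, the question does not give one):

```r
# Read the sample data; D_Time comes in as plain text here.
df <- read.table(text = "D_Time,Speed_BT,Speed_GT
2016-09-12 00:15:00,23,60
2016-09-12 00:45:00,13,48
2016-09-12 01:30:00,13,25", header = TRUE, sep = ",",
                 stringsAsFactors = FALSE)

# Convert to POSIXct so it matches the class given in the question.
# Assumption: timestamps are in UTC.
df$D_Time <- as.POSIXct(df$D_Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
class(df$D_Time)
```

Both data frames should use the same class and time zone for D_Time before they are merged.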

2 Comments

I have used this command, as stated in the question. It is not working for this particular problem, and I am not sure what is wrong. Is there any other way to combine these data frames?

Can you be specific about what is not working? Please share an error message, or the part of the output that does not match your expectation, so that I can adjust the solution.

0

Here are two data.table approaches:

Multiple right joins

This is more or less the data.table version of Sotos' answer:

library(data.table)
setDT(df1, key = "D_Time")[setDT(df2, key = "D_Time")[
  .(D_Time = seq(min(df1$D_Time, df2$D_Time),
                 max(df1$D_Time, df2$D_Time), by = "15 mins"))]]
                D_Time Speed_BT Speed_GT Speed_AA Speed_DD
1: 2016-09-12 00:15:00       23       60       NA       NA
2: 2016-09-12 00:30:00       NA       NA       29       17
3: 2016-09-12 00:45:00       13       48       NA       NA
4: 2016-09-12 01:00:00       NA       NA       46       59
5: 2016-09-12 01:15:00       NA       NA       NA       NA
6: 2016-09-12 01:30:00       13       25       36       51

Using melt() and dcast()

This approach also works when more than two data frames have to be combined. The individual data frames are reshaped from wide to long form and combined into one large data.table, which is then reshaped from long to wide format again. Finally, the sequence of time stamps is right joined.

rbindlist(lapply(list(df1, df2), melt, id.vars = "D_Time"))[
  , dcast(.SD, D_Time ~ variable)][
    .(seq(min(D_Time), max(D_Time), by = "15 mins")), on = "D_Time"]
                D_Time Speed_BT Speed_GT Speed_AA Speed_DD
1: 2016-09-12 00:15:00       23       60       NA       NA
2: 2016-09-12 00:30:00       NA       NA       29       17
3: 2016-09-12 00:45:00       13       48       NA       NA
4: 2016-09-12 01:00:00       NA       NA       46       59
5: 2016-09-12 01:15:00       NA       NA       NA       NA
6: 2016-09-12 01:30:00       13       25       36       51
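
To illustrate the point about more than two data frames, here is a sketch of the same steps written out one at a time with a third input. Note that df3, its column Speed_XX and its values are invented purely for illustration, and the "UTC" time zone and the helper stamp() are assumptions as well.

```r
library(data.table)

# Helper (hypothetical) to build POSIXct stamps; time zone is assumed.
stamp <- function(x) as.POSIXct(paste("2016-09-12", x), tz = "UTC")

df1 <- data.table(D_Time = stamp(c("00:15:00", "00:45:00", "01:30:00")),
                  Speed_BT = c(23, 13, 13), Speed_GT = c(60, 48, 25))
df2 <- data.table(D_Time = stamp(c("00:30:00", "01:00:00", "01:30:00")),
                  Speed_AA = c(29, 46, 36), Speed_DD = c(17, 59, 51))
# Invented third data frame, only to show that the pattern scales.
df3 <- data.table(D_Time = stamp(c("00:15:00", "01:00:00")),
                  Speed_XX = c(40, 42))

# 1. Reshape each input from wide to long and stack them.
long <- rbindlist(lapply(list(df1, df2, df3), melt, id.vars = "D_Time"))

# 2. Reshape the stacked data back to wide, one column per variable.
wide <- dcast(long, D_Time ~ variable, value.var = "value")

# 3. Right join the full 15-minute grid so missing stamps appear as NA rows.
grid   <- data.table(D_Time = seq(min(wide$D_Time), max(wide$D_Time),
                                  by = "15 mins"))
result <- wide[grid, on = "D_Time"]
print(result)
```

Adding a further input only means extending the list passed to lapply(); the remaining steps stay the same.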

Data

df1 <- readr::read_table(
  "   D_Time                   Speed_BT       Speed_GT
2016-09-12 00:15:00          23            60  
  2016-09-12 00:45:00          13            48  
  2016-09-12 01:30:00          13            25  ")
df2 <- readr::read_table(
  "D_Time                     Speed_AA        Speed_DD
2016-09-12 00:30:00          29            17  
  2016-09-12 01:00:00          46            59  
  2016-09-12 01:30:00          36            51")
