I have three data frames: df1 holds time intervals, df2 is a list of IDs, and df3 is a list of IDs with an associated date.

df1 <- structure(list(season = structure(c(2L, 1L), .Label = c("summer", 
    "winter"), class = "factor"), mindate = structure(c(1420088400, 
    1433131200), class = c("POSIXct", "POSIXt")), maxdate = structure(c(1433131140, 
    1448945940), class = c("POSIXct", "POSIXt")), diff = structure(c(150.957638888889, 
    183.040972222222), units = "days", class = "difftime")), .Names = c("season", 
    "mindate", "maxdate", "diff"), row.names = c(NA, -2L), class = "data.frame")

df2 <- structure(list(ID = c(23796, 23796, 23796)), .Names = "ID", row.names = c(NA, 
    -3L), class = "data.frame")

df3 <- structure(list(ID = c("23796", "123456", "12134"), time = structure(c(1420909920, 
1444504500, 1444504500), class = c("POSIXct", "POSIXt"), tzone = "US/Eastern")), .Names = c("ID", 
"time"), row.names = c(NA, -3L), class = "data.frame")

The code should check whether df2$ID == df3$ID. If that is true, and df3$time >= df1$mindate and df3$time <= df1$maxdate, the result should be df1$maxdate - df3$time; otherwise it should be df1$maxdate - df1$mindate. I tried using the ifelse function. This works when I manually specify individual cells, but that is not what I want, because each data frame has many more rows and their row counts differ.

df1$result <- ifelse(df2[1,1] == df3[1,1] & df3[1,2] >= df1$mindate & df3[1,2] <= df1$maxdate, 
                     difftime(df1$maxdate,df3[1,2],units="days"),
                     difftime(df1$maxdate,df1$mindate,units="days"))

EDIT: The desired output is (when removing last row of df2):

 season    mindate             maxdate          diff   result
1 winter 2015-01-01 2015-05-31 23:59:00 150.9576 days 141.9576
2 summer 2015-06-01 2015-11-30 23:59:00 183.0410 days 183.0410

Any ideas? I don't see how I could merge the data frames to make them the same length. Note that df2 can have any number of rows without affecting the code. Issues arise when df1 and df3 differ in their number of rows.

  • Could you please add your desired output? Commented Jun 28, 2018 at 18:32

1 Answer

The > and < operators are vectorized, so the comparison can run over whole columns at once:

transform(df1, result = ifelse(df3$ID %in% df2$ID & df3$time > mindate & df3$time < maxdate, difftime(maxdate, df3$time, units = "days"), difftime(maxdate, mindate, units = "days")))
  season             mindate             maxdate          diff   result
1 winter 2014-12-31 21:00:00 2015-05-31 20:59:00 150.9576 days 141.9576
2 summer 2015-05-31 21:00:00 2015-11-30 20:59:00 183.0410 days 183.0410

You can also use the %between% operator from the data.table package, which includes both bounds by default:

library(data.table)
transform(df1, result = ifelse(df3$ID %in% df2$ID & df3$time %between% df1[2:3],
               difftime(maxdate, df3$time, units = "days"), difftime(maxdate, mindate, units = "days")))

  season             mindate             maxdate          diff   result
1 winter 2014-12-31 21:00:00 2015-05-31 20:59:00 150.9576 days 141.9576
2 summer 2015-05-31 21:00:00 2015-11-30 20:59:00 183.0410 days 183.0410
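
To make the "vectorized" point concrete, here is a minimal sketch on toy vectors (the values are made up and are not the question's data). It shows that the comparison and ifelse both work position by position, which is why the approach above assumes row i of df1 belongs with row i of df3.

```r
# Toy vectors (hypothetical values, not the question's data)
lower <- c(1, 10)
upper <- c(5, 20)
x     <- c(3, 25)

# Element-wise test: x[1] is checked against lower[1]/upper[1],
# x[2] against lower[2]/upper[2]
inside <- x > lower & x < upper

# ifelse() then picks, per position, from the "yes" or the "no" vector
res <- ifelse(inside, upper - x, upper - lower)
```

If the vectors had different lengths, R would recycle the shorter one, so the pairing between rows would no longer be meaningful.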

4 Comments

The solutions work for this particular example. But when I apply them (e.g., the second one) to my particular need, I get the following error/warnings returned: Error in ifelse(ind$elasmo %in% df$elasmo & ind$Date.and.time %between% : argument "no" is missing, with no default In addition: Warning messages: 1: In >=.default(x, lower) : longer object length is not a multiple of shorter object length 2: In <=.default(x, upper) : longer object length is not a multiple of shorter object length. My df1 and df3 are not of the same length. How could this be fixed?
@FlyingDutch The first thing you need to work out is how the two data frames can end up with the same number of rows, since the logic used here pairs the first row of df1 with the first row of df3, and so on. Otherwise you need an ID that lets you merge the two. Also, the ifelse statement continues onto the next line. Look at the very first code block and you will see that the ifelse statement is long.
However, despite being able to understand the code, I am struggling to get rid of the error caused by the different number of rows. The warnings are fine; correct results are still produced. Any suggestions?
@FlyingDutch Say, for example, that data1 has 10 rows and data3 has 6 rows. How are the rows related? Which row of data3 does row 1 of data1 map to? There has to be some way to map the rows of the two data frames onto each other, e.g. an ID that says row 1 of df1 maps to row 1 of df3, and row 2 of df1 maps to row 2 of df3 or to a different row. Once you have that, you can merge the two data frames so that they have the same number of rows.
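
One possible way to avoid the row-to-row mapping altogether, offered only as a sketch: loop over the intervals and look up the matching events for each one, so the two data frames never need the same number of rows. The names intervals, events and wanted_ids are hypothetical stand-ins for df1, df3 and df2$ID, the values are made up, and using the earliest matching event when several fall inside one interval is an assumption the question does not settle.

```r
# Hypothetical stand-ins for df1 (intervals), df3 (events) and df2$ID (wanted_ids)
intervals <- data.frame(
  season  = c("winter", "summer"),
  mindate = as.POSIXct(c("2015-01-01 00:00:00", "2015-06-01 00:00:00"), tz = "UTC"),
  maxdate = as.POSIXct(c("2015-05-31 23:59:00", "2015-11-30 23:59:00"), tz = "UTC")
)
events <- data.frame(
  ID   = c("23796", "123456", "12134"),
  time = as.POSIXct(c("2015-01-10 12:00:00", "2015-10-10 12:00:00",
                      "2015-10-10 12:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
wanted_ids <- c("23796")

# Keep only the events whose ID is in the wanted list
ev <- events[events$ID %in% wanted_ids, ]

# For each interval: start from the earliest matching event inside it
# (assumption), or from mindate when no event falls inside the interval
intervals$result <- vapply(seq_len(nrow(intervals)), function(i) {
  hit <- ev$time[ev$time >= intervals$mindate[i] & ev$time <= intervals$maxdate[i]]
  start <- if (length(hit) > 0) min(hit) else intervals$mindate[i]
  as.numeric(difftime(intervals$maxdate[i], start, units = "days"))
}, numeric(1))
```

Because each interval is handled on its own, the number of rows in events does not matter, and no recycling warnings should occur. A different rule for multiple matches (latest event, one output row per event) would only change the line that computes start.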
