I need to loop over each ID and, within each ID, over each year, comparing several variables between data frames D1 and D2 with an if-else condition.
D1:
ID year X1
1 2000 34563
1 2001 34563
1 2002 12367
2 2010 14363
2 2011 14363
2 2012 13312
2 2013 13312
2 2014 13312
D2:
year X1 X2
2001 34563 12367
2011 14363 13312
I created X2 in D1 (X2 is the following year's X1 in D1) by duplicating column X1 and shifting it up by one row. This is a rough approach: if an ID has no data for the following year, X2 should be NA, but the shift fills it with X1 from the next ID in the data frame.
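To make the NA requirement concrete, here is a rough base R sketch of what I think X2 should look like: for each row, look up X1 for the same ID in year + 1, and leave NA when no such row exists. The names key_now and key_next are just mine for illustration, and the sketch assumes each ID/year pair appears at most once in D1.

```r
# Toy copy of D1 (same values as in the reproducible code of this post)
D1 <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 2),
  year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
  X1   = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312)
)

# Build an "ID year" key for each row and for the same ID one year later
key_now  <- paste(D1$ID, D1$year)
key_next <- paste(D1$ID, D1$year + 1)

# match() returns NA when the following year is missing for that ID,
# so X2 becomes NA instead of borrowing X1 from the next ID
D1$X2 <- D1$X1[match(key_next, key_now)]
```

With this version the last row of each ID gets NA, rather than the first X1 of the next ID.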
For each ID in D1, I need to go through each of its years, and for a year N, if both of the following hold against the D2 row for that year:
- D1$X1 == D2$X1
- D1$X2 == D2$X2
then D1$G = 1; otherwise D1$G = 0.
If there is no data for year N+1, condition 2 is ignored.
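Spelled out as code, the rule might look something like the minimal sketch below. It assumes D2 has at most one row per year and that a D1 row is compared only with the D2 row for the same year. The names idx, cond1 and cond2 are illustrative, and X2 is hardcoded here with NA where an ID has no following year.

```r
# Sample data; X2 is the following year's X1 within the same ID (NA if absent)
D1 <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 2),
  year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
  X1   = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312),
  X2   = c(34563, 12367, NA, 14363, 13312, 13312, 13312, NA)
)
D2 <- data.frame(
  year = c(2001, 2011),
  X1   = c(34563, 14363),
  X2   = c(12367, 13312)
)

# Row of D2 that has the same year as each D1 row (NA when D2 lacks that year)
idx <- match(D1$year, D2$year)

# Condition 1: X1 agrees with the matching D2 row
cond1 <- D1$X1 == D2$X1[idx]

# Condition 2: X2 agrees, or is ignored when there is no year N+1 value
cond2 <- is.na(D1$X2) | D1$X2 == D2$X2[idx]

# G is 1 only when a matching D2 year exists and both conditions hold
D1$G <- as.integer(!is.na(idx) & cond1 & cond2)
```

On the sample data this gives G = 1 only for the 2001 and 2011 rows.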
Now I want to compare each row in D1 directly with D2. I tried an if-else statement as follows:
D1$G <- ifelse(D1$X1 == D2$X1 & D1$X2 == D2$X2 & D1$year == D2$year, "1", "0")
However, this is what I end up with:
ID year X1 X2 G
1 1 2000 34563 34563 0
2 1 2001 34563 12367 0
3 1 2002 12367 14363 0
4 2 2010 14363 14363 0
5 2 2011 14363 13312 0
6 2 2012 13312 13312 0
7 2 2013 13312 13312 0
8 2 2014 13312 NA 0
instead of:
ID year X1 X2 G
1 1 2000 34563 34563 0
2 1 2001 34563 12367 1
3 1 2002 12367 14363 0
4 2 2010 14363 14363 0
5 2 2011 14363 13312 1
6 2 2012 13312 13312 0
7 2 2013 13312 13312 0
8 2 2014 13312 NA 0
I want to understand where I'm going wrong (or whether there is a simpler method). Any help is appreciated.
Reproducible code:
D1 <- data.frame(ID = c(1, 1, 1, 2, 2, 2, 2, 2),
year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
X1 = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312)
)
D2 <- data.frame(year = c(2001, 2011),
X1 = c(34563, 14363),
X2 = c(12367, 13312)
)
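# creating X2 in D1: the next row's X1 (X1 shifted up by one row)
library(data.table)  # assumption: shift() here is data.table::shift
D1$X2 <- shift(D1$X1, 1, type = "lead")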