I have searched SO, and although there are many Q&As about conditionally removing rows, none of them fit my problem.
I have a data.frame containing longitudinal measurements of variables x, y, etc., at various time points (time), in several subjects (id). Some subjects experience an event at some time (ev, coded 1 at the event and 0 otherwise). I would like to reduce the initial data.frame to:
- 1) All rows for subjects that have not experienced an event (OK, that's easy), but also
- 2) For the subjects that have experienced an event, all rows prior to the event (that is, all rows with times less than the time of that individual's event).
so that,
    testdf <- data.frame(id = c(rep("A", 4), rep("B", 4), rep("C", 4)),
                         x = c(NA, NA, 1, 2, 3, NA, NA, 1, 2, NA, NA, 5),
                         y = rev(c(NA, NA, 1, 2, 3, NA, NA, 1, 2, NA, NA, 5)),
                         time = c(1, 2, 3, 4, 0.1, 0.5, 10, 20, 3, 2, 1, 0.5),
                         ev = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1))
would reduce to
      id  x  y time ev
    1  A NA  5  1.0  0
    2  A NA NA  2.0  0
    3  A  1 NA  3.0  0
    4  A  2  2  4.0  0
    5  B  3  1  0.1  0
    6  C  2  2  3.0  0
    7  C NA  1  2.0  0
    8  C NA NA  1.0  0
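
A minimal base R sketch that reproduces the table above, under one assumption: "prior to the event" is read as "rows that come before the event row within each id, in the order they appear in the data.frame". That reading is what the expected output reflects for subject C, whose rows are stored in decreasing time order; a strict `time < event time` filter would instead drop all of C's rows, because C's event occurs at time 0.5, the smallest time for that subject. The names `before_event` and `reduced` are only illustrative.

```r
# Example data from the question
testdf <- data.frame(id = c(rep("A", 4), rep("B", 4), rep("C", 4)),
                     x = c(NA, NA, 1, 2, 3, NA, NA, 1, 2, NA, NA, 5),
                     y = rev(c(NA, NA, 1, 2, 3, NA, NA, 1, 2, NA, NA, 5)),
                     time = c(1, 2, 3, 4, 0.1, 0.5, 10, 20, 3, 2, 1, 0.5),
                     ev = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1))

# ASSUMPTION: rows are already in the intended order within each id.
# The running sum of ev within each id stays at 0 until the first event row,
# so "cumulative sum == 0" marks the rows that precede the event.
# Subjects with no event never leave 0, so all of their rows are kept.
before_event <- ave(testdf$ev, testdf$id, FUN = cumsum) == 0

reduced <- testdf[before_event, ]
rownames(reduced) <- NULL  # renumber rows 1..n
reduced
```

If the intended rule really is a comparison on time, the data would need to be sorted by id and time first (for example with `order(testdf$id, testdf$time)`), and the result for subject C would then differ from the table above.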