Representative dataset:
ID,drug_start,date_of_bloods,result
1234,NA,12-10-2010,80
1234,NA,12-10-2011,50
1234,NA,12-10-2013,10
4532,05-03-2015,01-01-2013,80
4532,05-03-2015,01-01-2014,60
4532,05-03-2015,01-01-2016,40
7894,04-09-2016,03-08-2012,40
7894,04-09-2016,03-08-2014,38
7894,04-09-2016,02-06-2015,30
7894,04-09-2016,29-10-2016,27
7894,04-09-2016,10-10-2017,26
I would like to filter this to result >20 and those rows where the date_of_bloods is < drug_start within each row. Where NA is in the drug_start column I would like them just filtered on result >20. Desired outcome would be:
ID,drug_start,date_of_bloods,result
1234,NA,12-10-2010,80
1234,NA,12-10-2011,50
4532,05-03-2015,01-01-2013,80
4532,05-03-2015,01-01-2014,60
7894,04-09-2016,03-08-2012,40
7894,04-09-2016,03-08-2014,38
7894,04-09-2016,02-06-2015,30
So far I have tried :
table <- readxl::read_excel("data.xlsx", sheet = "bloods")
table <- as.data.frame(table)
table$result <- as.numeric(table$result)
table<- subset(table, result >20)
table$date_of_bloods <- as.Date(table$date_of_bloods, format="%d/%m/%Y")
table$drug_start <- as.Date(table$drug_start, format="%d/%m/%Y")
table[!(table$drug_start >= table$date_of_bloods),]
The final line does not work and just turns all the rows where drug_start is NA to all columns reading NA and the dates not having filtered.
Your help would be greatly welcomed
dput()for data...