I have a data frame with the following columns:
shipment_id  created_at  picked_at   packed_at   shipped_at
CSDJKH231BN  2019-02-03  2019-02-03
CSDJKH231BN  2019-02-03  2019-02-03  2019-02-04  2019-02-05
CSDJKH2KFJ3  2019-02-01  2019-02-04  2019-02-07
The data is loaded into the R server from a Google Drive spreadsheet that is constantly being updated.
library(RCurl)  # provides getURL()
u1 <- "https://docs.google.com/spreadsheets/d/e/<link>"  # <link> is a placeholder for the rest of the URL
tc1 <- getURL(u1, ssl.verifypeer = FALSE)
x <- read.csv(textConnection(tc1))
Suppose that in the first update shipment_id CSDJKH231BN had only reached picked_at, and in the second update from Google Drive the same CSDJKH231BN has reached shipped_at. How do I keep only the row that goes up to shipped_at for such a shipment_id, while also keeping shipment_ids like CSDJKH2KFJ3 that are still being processed and have not shipped yet?
Basically I just want to delete the duplicate entries, but this code is not working for me:
df <- df[!duplicated(df), ]
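To make the problem reproducible, here is a minimal sketch built from the sample rows above. It assumes blank cells arrive as empty strings or NA, and the names stage_cols, filled, df_sorted and latest are made up for illustration. It shows that whole-row duplicated() finds nothing to drop, because the two CSDJKH231BN rows are not identical, and then one possible way to get the result described above: keep, for each shipment_id, the row with the most stage columns filled in. This is only a sketch under those assumptions, not necessarily the best approach.

```r
# Sample rows from above; "" stands in for a blank cell (assumption).
df <- data.frame(
  shipment_id = c("CSDJKH231BN", "CSDJKH231BN", "CSDJKH2KFJ3"),
  created_at  = c("2019-02-03", "2019-02-03", "2019-02-01"),
  picked_at   = c("2019-02-03", "2019-02-03", "2019-02-04"),
  packed_at   = c("", "2019-02-04", "2019-02-07"),
  shipped_at  = c("", "2019-02-05", ""),
  stringsAsFactors = FALSE
)

# Whole-row comparison: no row is an exact copy of another.
any(duplicated(df))  # FALSE

# Hypothetical helper names from here on.
stage_cols <- c("created_at", "picked_at", "packed_at", "shipped_at")

# Number of filled-in stage columns per row (NA and "" both count as missing).
filled <- rowSums(!is.na(df[stage_cols]) & df[stage_cols] != "")

# Sort so the most complete row of each shipment comes first,
# then keep only the first row per shipment_id.
df_sorted <- df[order(df$shipment_id, -filled), ]
latest <- df_sorted[!duplicated(df_sorted$shipment_id), ]
```

With the sample data, latest should hold one row per shipment_id: the CSDJKH231BN row that reaches shipped_at, and the CSDJKH2KFJ3 row that is still unshipped.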
Any help would be appreciated.