  1. I want to identify duplicates in the ID column, but only where Wave == 2 (in the example below, only 'C' is duplicated in wave 2).

  2. I then want to select the latest duplicate based on Date and delete it from the dataframe df.

How do I do the above?

df <- structure(list(ID = c("E", "G", "C", "B", "D", "E", "A", "D", 
"F", "F", "C", "A", "B", "C", "A"), Wave = c(2L, 1L, 1L, 2L, 
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L), Date = c("25/02/2020", 
"18/02/2020", "14/02/2020", "21/02/2020", "24/02/2020", "16/02/2020", 
"12/02/2020", "15/02/2020", "17/02/2020", "26/02/2020", "22/02/2020", 
"20/02/2020", "13/02/2020", "23/02/2020", "11/02/2020")), class = "data.frame", row.names = c(NA, 
-15L))

2 Answers

You can use slice to keep only the earliest row in each ID/Wave group where Wave == 2, which deletes the later duplicate. Groups from other waves are kept whole.

library(dplyr)

df %>%
  mutate(Date = lubridate::dmy(Date)) %>%  # parse dd/mm/yyyy strings into Date
  group_by(ID, Wave) %>%
  # Wave 2 groups: keep the earliest row; all other groups: keep every row
  slice(if(first(Wave) == 2) which.min(Date) else seq_len(n()))

#   ID     Wave Date      
#   <chr> <int> <date>    
# 1 A         1 2020-02-12
# 2 A         1 2020-02-11
# 3 A         2 2020-02-20
# 4 B         1 2020-02-13
# 5 B         2 2020-02-21
# 6 C         1 2020-02-14
# 7 C         2 2020-02-22
# 8 D         1 2020-02-15
# 9 D         2 2020-02-24
#10 E         1 2020-02-16
#11 E         2 2020-02-25
#12 F         1 2020-02-17
#13 F         2 2020-02-26
#14 G         1 2020-02-18
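
For step 1 of the question (just seeing which IDs are duplicated in Wave 2 before deleting anything), a minimal sketch along these lines may help. It rebuilds the question's data as df so it runs on its own. wave2_dups is a name made up for this illustration, and as.Date() is used instead of lubridate::dmy() only to keep the dependencies down.

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  ID = c("E", "G", "C", "B", "D", "E", "A", "D",
         "F", "F", "C", "A", "B", "C", "A"),
  Wave = c(2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L),
  Date = c("25/02/2020", "18/02/2020", "14/02/2020", "21/02/2020",
           "24/02/2020", "16/02/2020", "12/02/2020", "15/02/2020",
           "17/02/2020", "26/02/2020", "22/02/2020", "20/02/2020",
           "13/02/2020", "23/02/2020", "11/02/2020"),
  stringsAsFactors = FALSE
)

# Rows whose ID occurs more than once within Wave 2
# (wave2_dups is an illustrative name, not from the original post)
wave2_dups <- df %>%
  mutate(Date = as.Date(Date, format = "%d/%m/%Y")) %>%  # dd/mm/yyyy -> Date
  filter(Wave == 2) %>%
  group_by(ID) %>%
  filter(n() > 1) %>%   # keep only IDs that appear at least twice
  ungroup()

wave2_dups
```

With the example data this should return only the two Wave 2 rows for ID C, which is a quick way to check what will be affected before removing anything.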

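Here is an option with filter. After sorting by date, it keeps only the first (earliest) row of each Wave 2 group and every row of the other groups.

library(dplyr)
library(lubridate)
df %>% 
    arrange(ID, Wave, dmy(Date)) %>%   # sort by parsed date within each ID/Wave
    group_by(ID, Wave) %>% 
    # Wave 2 groups: keep only the earliest row; other groups: keep every row
    filter((row_number() == 1 & first(Wave) == 2) | first(Wave) != 2)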
# A tibble: 14 x 3
# Groups:   ID, Wave [13]
#   ID     Wave Date      
#   <chr> <int> <chr>     
# 1 A         1 11/02/2020
# 2 A         1 12/02/2020
# 3 A         2 20/02/2020
# 4 B         1 13/02/2020
# 5 B         2 21/02/2020
# 6 C         1 14/02/2020
# 7 C         2 22/02/2020
# 8 D         1 15/02/2020
# 9 D         2 24/02/2020
#10 E         1 16/02/2020
#11 E         2 25/02/2020
#12 F         1 17/02/2020
#13 F         2 26/02/2020
#14 G         1 18/02/2020
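
If you would rather avoid dplyr, the same idea can be sketched in base R: sort by date, then let duplicated() flag the second and later occurrences of each ID/Wave pair. This is not taken from the answer above. It is only an alternative, under the assumption that "delete the latest duplicate" means keeping the earliest Wave 2 row per ID. df_sorted, is_later_dup and result are illustrative names.

```r
# Example data copied from the question
df <- data.frame(
  ID = c("E", "G", "C", "B", "D", "E", "A", "D",
         "F", "F", "C", "A", "B", "C", "A"),
  Wave = c(2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L),
  Date = c("25/02/2020", "18/02/2020", "14/02/2020", "21/02/2020",
           "24/02/2020", "16/02/2020", "12/02/2020", "15/02/2020",
           "17/02/2020", "26/02/2020", "22/02/2020", "20/02/2020",
           "13/02/2020", "23/02/2020", "11/02/2020"),
  stringsAsFactors = FALSE
)

# Parse the dd/mm/yyyy strings so that sorting is chronological
df$Date <- as.Date(df$Date, format = "%d/%m/%Y")

# Sort by date so the earliest row of each ID/Wave pair comes first
df_sorted <- df[order(df$Date), ]

# duplicated() is FALSE for the first occurrence of an ID/Wave pair and
# TRUE for later ones; restrict the deletion to Wave 2 only
is_later_dup <- df_sorted$Wave == 2 & duplicated(df_sorted[c("ID", "Wave")])

result <- df_sorted[!is_later_dup, ]
result
```

Note that result comes back ordered by Date rather than by ID. Add a final order() call if the original ordering matters.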
