I need to delete certain records from my table that I consider "duplicates". They're not exact duplicates, since not every column value is the same. Rather, the logic is something like this:
If col_a and col_b have the same values across several rows, and the col_c values (col_c is a timestamp) are within, say, 5 minutes of each other, then delete all of those rows except the one with the earliest timestamp.
Example Data:
id  col_a  col_b   col_c
1   foo    bar     2016-01-01 00:00:00
2   foo    bar     2016-01-01 00:00:12
3   foo    bar     2016-01-01 00:00:22
4   foo    bar     2016-01-05 00:00:00
5   apple  banana  2016-01-01 00:00:00
6   apple  banana  2016-01-05 00:00:00
In the above example, I want to delete id = 2 and id = 3. Is this possible to do in MySQL?
Write a SELECT statement that identifies the rows to be removed. Once you have tested it and verified that it returns exactly the rows you want, convert it into a DELETE.
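
As a rough sketch of that approach, assuming the table is called my_table (a placeholder, since the question doesn't name it) and that "within 5 minutes" means "some earlier row with the same col_a and col_b exists no more than 5 minutes before this one", something like the following might work. Note that under this reading a run of rows each within 5 minutes of the previous one collapses to its first row, even if the whole run spans more than 5 minutes; if that isn't what you want, the condition needs adjusting.

```sql
-- Step 1: identify the rows to be removed.
-- A row t qualifies if an earlier row e with the same col_a/col_b
-- exists at most 5 minutes before it.
-- my_table is a hypothetical table name.
SELECT DISTINCT t.id, t.col_a, t.col_b, t.col_c
FROM my_table AS t
JOIN my_table AS e
  ON  e.col_a = t.col_a
  AND e.col_b = t.col_b
  AND e.col_c <  t.col_c                       -- e is strictly earlier
  AND e.col_c >= t.col_c - INTERVAL 5 MINUTE   -- but no more than 5 minutes earlier
ORDER BY t.id;

-- Step 2: once the SELECT returns only the rows you expect,
-- convert it to a multi-table DELETE using the same join condition.
DELETE t
FROM my_table AS t
JOIN my_table AS e
  ON  e.col_a = t.col_a
  AND e.col_b = t.col_b
  AND e.col_c <  t.col_c
  AND e.col_c >= t.col_c - INTERVAL 5 MINUTE;
```

With the example data, the SELECT should return the rows with id 2 and id 3. Because the comparison is strictly "earlier", rows with identical timestamps are not treated as duplicates of each other; if you need that, you could add a tie-break on id. It's also worth running the DELETE inside a transaction (or on a copy of the table) the first time.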