0

Some rows share the same primary key (ID), but the rest of the row may be different. For example:

ID   Age   Info
2    21    2763
2    21    6276
3    31    82756

In this case, the first and second rows have the same ID and Age but different Info. For rows with a duplicate ID, I want to keep one of them at random and delete the others. My data sets contain too many records like this to delete them one by one. Is there a solution? Thanks.

10
  • Which is your primary key in this table? Commented Nov 13, 2015 at 16:06
  • @Adish The PK is ID in this example. Thanks. Commented Nov 13, 2015 at 16:08
  • How can a PK allow duplicate values? Anyway, you want to remove duplicate IDs, right? Or is it a combination of ID and Age that has to be treated as a duplicate? Commented Nov 13, 2015 at 16:24
  • @Adish Removing duplicate IDs is good enough for my case. Thanks. Commented Nov 13, 2015 at 16:28
  • Are you using MySQL, SQL Server, Oracle, or something else? Commented Nov 13, 2015 at 16:34

4 Answers

1

Try this:

DELETE t1
FROM mytable AS t1
INNER JOIN mytable AS t2 
ON t1.ID = t2.ID AND t1.Age = t2.Age AND t1.Info > t2.Info

The above should work in MySQL and SQL Server. The statement deletes every row in an (ID, Age) slice except the one with the smallest Info value.

Note: The above works provided that Info values are unique per (ID, Age) slice.
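To make the note concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made only so the example runs anywhere; it does not accept the DELETE ... JOIN form, so the same predicate is written with EXISTS. The second (3, 31, 82756) row is made-up data added to show what happens when Info is not unique within a slice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, Age INTEGER, Info INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    # The last row is a hypothetical exact duplicate, not from the question.
    [(2, 21, 2763), (2, 21, 6276), (3, 31, 82756), (3, 31, 82756)],
)

# Same condition as the self-join: delete a row when another row in the
# same (ID, Age) slice has a smaller Info value.
conn.execute(
    """
    DELETE FROM mytable
    WHERE EXISTS (
        SELECT 1
        FROM mytable AS t2
        WHERE t2.ID = mytable.ID
          AND t2.Age = mytable.Age
          AND mytable.Info > t2.Info
    )
    """
)

remaining = conn.execute(
    "SELECT ID, Age, Info FROM mytable ORDER BY ID, Info"
).fetchall()
print(remaining)
```

The ID = 2 slice is reduced to its smallest Info, but both identical ID = 3 rows survive, because neither has an Info strictly greater than the other.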


3 Comments

Thanks for your answer. What are t1 and t2?
@GavinNiu They are table aliases.
This will not delete rows where IDs match but Age does not. This will not delete rows where all three columns are identical.
1

With a window function (SQL Server syntax; mytable stands in for your table name):

;WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY info) AS rn
    FROM mytable
)
DELETE FROM cte WHERE rn <> 1
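The same idea can be tried outside SQL Server. Below is a rough sketch using Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or later (needed for window functions). SQLite cannot delete through a CTE, so this variant, which is an adaptation rather than the query above, collects the implicit rowid of every row numbered 2 or higher and deletes those. The table name mytable and the repeated (2, 21, 6276) row are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, Age INTEGER, Info INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    # Includes an exact duplicate to show that Info need not be unique.
    [(2, 21, 2763), (2, 21, 6276), (2, 21, 6276), (3, 31, 82756)],
)

# Number the rows within each ID; anything numbered 2 or higher is a duplicate.
conn.execute(
    """
    DELETE FROM mytable
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Info) AS rn
            FROM mytable
        )
        WHERE rn <> 1
    )
    """
)

rows = conn.execute("SELECT ID, Age, Info FROM mytable ORDER BY ID").fetchall()
print(rows)
```

Because the rows are numbered per ID rather than compared by value, exactly one row per ID remains even when several rows are identical in every column.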


0

I think you're looking for something like this:

delete from TableName where info not in 
(select min(info) from TableName group by ID,Age);

Try the SELECT statement first to make sure it returns the right rows, then add the DELETE part to it.
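As a sketch of that "preview first" step, the snippet below uses Python's built-in sqlite3 module (an assumption for portability) to list the rows the DELETE would remove before anything is deleted. The sample data is hypothetical: the second row gives ID = 2 an Info value that also happens to be the minimum of another group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (ID INTEGER, Age INTEGER, Info INTEGER)")
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?)",
    # Hypothetical data: 82756 appears under both ID 2 and ID 3.
    [(2, 21, 2763), (2, 21, 82756), (3, 31, 82756)],
)

# Preview: the rows the DELETE would remove (same WHERE clause, SELECT instead).
would_delete = conn.execute(
    """
    SELECT ID, Age, Info
    FROM TableName
    WHERE Info NOT IN (SELECT MIN(Info) FROM TableName GROUP BY ID, Age)
    """
).fetchall()

# Cross-check: the IDs that currently have more than one row.
duplicate_ids = conn.execute(
    "SELECT ID FROM TableName GROUP BY ID HAVING COUNT(*) > 1"
).fetchall()

print(would_delete)
print(duplicate_ids)
```

With this data the preview comes back empty even though ID = 2 still has two rows, so the check exposes the case where Info is not unique across the whole table before any data is lost.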

11 Comments

Let me try! Thanks for your response!
This will only work if info is unique. A row of ID = 2, Info = 82756 would throw it off.
Correct, the assumption per the example is that Info is unique per grouped ID and Age.
For this query to work, Info must be unique at the table level.
Yes, it works with the example, but in my real case Info is not unique. I apologize for giving a bad example.
0

I would have suggested a set-based solution, but I could not get one to handle rows in which all 3 columns are identical. So I am suggesting a solution that uses ROWCOUNT and a WHILE loop instead. ROWCOUNT ensures that only 1 record is deleted at a time. The WHILE loop saves you from doing it manually, one by one.

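SET ROWCOUNT 1  -- each DELETE below removes at most one row

DECLARE @ctr INT
-- Size of the largest group of rows sharing an ID; stays NULL if there are no duplicates
SELECT TOP 1 @ctr = COUNT(*) FROM mytable GROUP BY ID HAVING COUNT(*) > 1 ORDER BY COUNT(*) DESC

WHILE @ctr > 1
BEGIN
    -- Remove one row belonging to some duplicated ID
    DELETE FROM mytable WHERE ID IN (SELECT ID FROM mytable GROUP BY ID HAVING COUNT(*) > 1)
    -- Reset and re-check; @ctr stays NULL once no duplicates remain, which ends the loop
    SELECT @ctr = NULL
    SELECT TOP 1 @ctr = COUNT(*) FROM mytable GROUP BY ID HAVING COUNT(*) > 1 ORDER BY COUNT(*) DESC
END

SET ROWCOUNT 0  -- restore the default (no row limit)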

The DELETE statement has no ordering, so which of the duplicate rows survives for each ID is arbitrary.
