0

I have a large SQL Server 2008 table with 60,000,000 records, and I have a problem with duplicate rows.

This query, taken from http://support.microsoft.com/kb/139444, lists my duplicates:

SELECT     id, sa_trvalue,  COUNT(*) AS tot  
FROM         msanal   
GROUP BY id, sa_trvalue  
HAVING      (COUNT(*) > 1)  

But when I follow the rest of the steps (INTO and DISTINCT), I get a "not enough memory to complete operation" error.

1
  • If you really need to create a new table with no duplicates, an easy way would be to restrict the query with WHERE id >= 0 AND id < 100000 and then page through until you've covered the entire range. If you just want to get rid of the duplicates, Mr Schmelter has given you a way. Commented Oct 1, 2013 at 15:48
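
The paging idea from the comment above could be sketched roughly as follows. This is only an illustration under several assumptions: `msanal_dedup` is a hypothetical, already-created target table; `id` is a non-negative integer; and only the `id` and `sa_trvalue` columns need to be copied. The batch size of 100000 comes from the comment and may need tuning.

```sql
-- Sketch only: copy distinct rows into a hypothetical table msanal_dedup,
-- one id range at a time, so each DISTINCT works on a small slice.
DECLARE @start BIGINT = 0;        -- assumes ids are non-negative
DECLARE @batch BIGINT = 100000;   -- range width suggested in the comment
DECLARE @max   BIGINT;

SELECT @max = MAX(id) FROM msanal;

WHILE @start <= @max
BEGIN
    INSERT INTO msanal_dedup (id, sa_trvalue)
    SELECT DISTINCT id, sa_trvalue
    FROM   msanal
    WHERE  id >= @start AND id < @start + @batch;

    SET @start = @start + @batch;
END
```

Because duplicate rows share the same id, all copies of a given row fall into the same range, so applying DISTINCT per range should give the same result as applying it to the whole table.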

2 Answers

1

You could try this approach, which might need less memory:

WITH CTE AS
(
    SELECT  id, sa_trvalue, 
            rn = ROW_NUMBER() OVER (PARTITION BY id, sa_trvalue ORDER BY id ASC)
    FROM    msanal   
)
DELETE FROM CTE WHERE rn > 1

A common table expression also has the advantage that you can easily modify it to see what you are going to delete: you just have to change DELETE to SELECT *.
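
For illustration, the preview variant described above would look something like this. It is the same CTE with only the final statement changed, so nothing is deleted:

```sql
-- Preview: list the rows the DELETE version would remove (rn > 1),
-- without modifying the table.
WITH CTE AS
(
    SELECT  id, sa_trvalue,
            rn = ROW_NUMBER() OVER (PARTITION BY id, sa_trvalue ORDER BY id ASC)
    FROM    msanal
)
SELECT * FROM CTE WHERE rn > 1;
```

On a table of this size it may be worth adding a TOP clause or an id range to the preview so that it does not return every duplicate at once.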


2 Comments

You don't really need to select id, sa_trvalue inside the CTE, do you?
@YuriyGalanter: No, just for demonstration purposes.
0
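-- Caution: this matches rows whose sa_trvalue also appears under a
-- different id, and it deletes every matching row (no copy is kept).
DELETE m1
FROM msanal m1
WHERE EXISTS
(SELECT NULL FROM msanal m2
 WHERE m2.sa_trvalue = m1.sa_trvalue AND m2.id <> m1.id)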
