26

I have a table that has a lot of duplicates in the Name column. I'd like to only keep one row for each.

The following lists the duplicates, but I don't know how to delete the duplicates and just keep one:

SELECT name FROM members GROUP BY name HAVING COUNT(*) > 1;

Thank you.

1

10 Answers 10

61

See the following question: Deleting duplicate rows from a table.

The adapted accepted answer from there (which is my answer, so no "theft" here...):

You can do it in a simple way assuming you have a unique ID field: you can delete all records that are the same except for the ID, but don't have "the minimum ID" for their name.

Example query:

DELETE FROM members
WHERE ID NOT IN
(
    SELECT MIN(ID)
    FROM members
    GROUP BY name
)

In case you don't have a unique index, my recommendation is to simply add an auto-incremental unique index. Mainly because it's good design, but also because it will allow you to run the query above.

Sign up to request clarification or add additional context in comments.

5 Comments

Here's how I understand the above: For each name, it groups them (only one if unique; several into one if duplicates), selects the smallest ID from the set, and then deletes any row whose ID doesn't exist in the table. Brilliant :) Thanks much Rax.
in mysql I get the following error when sending this query: "error 1093 (HY000) but it gives an error 'You cant specify target table 'members' for update in FROM clause" any ideas?
the problem was that 'members' was both the field and table name. this is what worked: delete from members where id not in (select min(id) from (select * from members) as x group by name)
Can we do DELETE FROM members WHERE ID NOT IN ( SELECT name FROM members GROUP BY name HAVING COUNT(*) > 1; )
ROWID (instead of name) can be used to differentiate rows if they are completely identical.
4

It would probably be easier to select the unique ones into a new table, drop the old table, then rename the temp table to replace it.

#create a table with same schema as members
CREATE TABLE tmp (...);

#insert the unique records
INSERT INTO tmp SELECT * FROM members GROUP BY name;

#swap it in
RENAME TABLE members TO members_old, tmp TO members;

#drop the old one
DROP TABLE members_old;

2 Comments

Thanks Paul. For those interested... CREATE TEMP TABLE tmp_members (name VARCHAR); INSERT INTO tmp_members SELECT name FROM members GROUP BY name; SELECT COUNT(name) FROM tmp_members; DELETE FROM members; VACUUM members; SELECT COUNT(name) FROM members; INSERT INTO members (name) SELECT * FROM tmp_members; SELECT COUNT(name) FROM members; SELECT DISTINCT COUNT(name) FROM members; SELECT name FROM members LIMIT 10; DROP TABLE tmp_members;
Sorry, I missed that you were using SQLite!
1

DELETE FROM tablename WHERE ID IN( SELECT MAX(ID) ID FROM tablename GROUP BY IDNumber HAVING COUNT(IDNumber) > 1 )

Comments

0

We have a huge database where deleting duplicates is part of the regular maintenance process. We use DISTINCT to select the unique records then write them into a TEMPORARY TABLE. After TRUNCATE we write back the TEMPORARY data into the TABLE.

That is one way of doing it and works as a STORED PROCEDURE.

1 Comment

I have to admit Rax Olgud's answer is much-much more sophisticated and probably runs 100 times quicker! :) - I'm thinking about adopting the solution... Deserves +1!
0

If we want to see first which rows you are about to delete. Then delete them.

with MYCTE as (
    SELECT DuplicateKey1
        ,DuplicateKey2 --optional
        ,count(*) X
    FROM MyTable
    group by DuplicateKey1, DuplicateKey2
    having count(*) > 1
) 
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
ON E.DuplicateKey1=cte.DuplicateKey1
    AND E.DuplicateKey2=cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt

Full example at http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/

Comments

0

You can join table with yourself by matched field and delete unmatching rows

DELETE t1 FROM table_name t1 
LEFT JOIN tablename t2 ON t1.match_field = t2.match_field
WHERE t1.id <> t2.id;

Comments

0

delete dup row keep one table has duplicate rows and may be some rows have no duplicate rows then it keep one rows if have duplicate or single in a table. table has two column id and name if we have to remove duplicate name from table and keep one. Its Work Fine at My end You have to Use this query.

DELETE FROM tablename
WHERE id NOT IN(

 SELECT id FROM
(
    SELECT MIN(id)AS id
    FROM tablename
    GROUP BY name HAVING 
    COUNT(*) > 1
)AS a )
AND id NOT IN(
(SELECT ids FROM
(
SELECT MIN(id)AS ids
    FROM tablename
    GROUP BY name HAVING 
    COUNT(*) =1
)AS a1
)
)

before delete table is below see the screenshot: enter image description here after delete table is below see the screenshot this query delete amit and akhil duplicate rows and keep one record (amit and akhil):

enter image description here

Comments

0

if you want to remove duplicate record from table.

CREATE TABLE tmp SELECT lastname, firstname, sex
FROM user_tbl;
GROUP BY (lastname, firstname);

DROP TABLE user_tbl;

ALTER TABLE tmp RENAME TO user_tbl;

Comments

0

show record

SELECT `page_url`,count(*) FROM wl_meta_tags GROUP BY page_url HAVING count(*) > 1

delete record

DELETE FROM wl_meta_tags 
WHERE meta_id NOT IN( SELECT meta_id 
FROM ( SELECT MIN(meta_id)AS meta_id FROM wl_meta_tags GROUP BY page_url HAVING COUNT(*) > 1 )AS a ) 
AND meta_id NOT IN( (SELECT ids FROM (
SELECT MIN(meta_id)AS ids FROM wl_meta_tags GROUP BY page_url HAVING COUNT(*) =1 )AS a1 ) )

Source url

Comments

-1
WITH CTE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY [emp_id] ORDER BY [emp_id]) AS Row, * FROM employee_salary
)


DELETE FROM CTE
WHERE ROW <> 1

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.