This table has only two columns. Two rows count as duplicates when both of their columns match.

That is:

col1 col2
X    X
X    X
X    Y  -- this is not a duplicate

I want to delete the duplicates but keep one row from each group. It doesn't matter which one, because they are identical.

I've tried variations of IN and JOIN, but I can't get the outer DELETE query to remove all but one row from each group of duplicates.
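
For reference, a query along these lines would list the duplicate groups before anything is deleted. It is only a sketch: the table name mytable is a placeholder, since the question does not name the table.

```sql
-- List every (col1, col2) pair that appears more than once,
-- together with how many copies exist.
-- "mytable" is an assumed name, not taken from the question.
SELECT col1, col2, COUNT(*) AS copies
FROM mytable
GROUP BY col1, col2
HAVING COUNT(*) > 1;
```

Running the same query after the cleanup should return no rows.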

  • What is your PRIMARY KEY? And which version of MySQL? Commented Feb 5, 2016 at 12:59
  • Possible duplicate of Remove duplicate rows in MySQL Commented Feb 5, 2016 at 13:00
  • There's no PRIMARY KEY. And version: 5.5.35-0ubuntu0.12.04.2 Commented Feb 5, 2016 at 13:03
  • Consider creating a new table (WITH A PRIMARY KEY) using only distinct values from the old table. Then drop the old table. Commented Feb 5, 2016 at 13:04
  • Duplicate question about duplicates: stackoverflow.com/questions/2630440/… Commented Feb 5, 2016 at 13:06
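
The suggestion in the comments to rebuild the table with a primary key could look roughly like the sketch below. The names mytable, mytable_dedup and mytable_old are hypothetical, and it assumes both columns are NOT NULL and of a type that can be indexed without a prefix length.

```sql
-- Create an empty copy with the same column definitions.
CREATE TABLE mytable_dedup LIKE mytable;

-- With only two columns, the pair itself can serve as the primary key.
-- Assumption: col1 and col2 never contain NULL.
ALTER TABLE mytable_dedup ADD PRIMARY KEY (col1, col2);

-- Copy one row per distinct (col1, col2) pair.
INSERT INTO mytable_dedup (col1, col2)
SELECT DISTINCT col1, col2
FROM mytable;

-- Swap the tables in a single statement, then drop the old one.
RENAME TABLE mytable TO mytable_old, mytable_dedup TO mytable;
DROP TABLE mytable_old;
```

Rows written to the old table between the INSERT and the RENAME would not be carried over, so this is best run while nothing else is writing to the table.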

3 Answers

Try this:

DELETE  a
FROM    mytable a
        LEFT JOIN
        (
            SELECT MIN(ID) ID, col1, col2
            FROM    mytable
            GROUP   BY col1, col2
        ) b ON  a.ID = b.ID AND
                a.col1 = b.col1 AND
                a.col2 = b.col2
WHERE   b.ID IS NULL

This assumes that ID is the primary key column.
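
Since the table in the question has no primary key, one possible way to use this query anyway is to add a temporary surrogate key first. This is a rough sketch, not a tested recipe: it assumes the table is named mytable and that INT UNSIGNED is large enough for its row count.

```sql
-- Add a surrogate key; MySQL assigns sequential values to existing rows.
ALTER TABLE mytable
    ADD COLUMN ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;

-- Keep the row with the lowest ID in each (col1, col2) group
-- and delete every other row.
DELETE a
FROM mytable a
LEFT JOIN (
    SELECT MIN(ID) AS ID, col1, col2
    FROM mytable
    GROUP BY col1, col2
) b ON a.ID = b.ID
WHERE b.ID IS NULL;

-- Optional: remove the helper column again once the cleanup is done.
ALTER TABLE mytable DROP COLUMN ID;
```

Keeping the ID column instead of dropping it would give the table a primary key going forward.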

EDIT:

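However, if you don't have an ID column, you can add a unique index instead. With IGNORE, rows that would violate the new index are deleted and only the first row of each duplicate group is kept. Note that the IGNORE clause of ALTER TABLE was deprecated in MySQL 5.6.17 and removed in 5.7.4, so this works on 5.5 but not on current versions: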

ALTER IGNORE TABLE mytable
  ADD UNIQUE INDEX all_columns_uq
    (col1, col2) ;

1 Comment

"This table only has 2 columns."

Either add a unique index with IGNORE, which deletes the extra copies of each duplicate row:

ALTER IGNORE TABLE table1 ADD UNIQUE INDEX idx_name (col1,col2);

OR

CREATE TABLE table1_temp AS
SELECT * FROM table1 GROUP BY col1, col2;

TRUNCATE TABLE table1;
INSERT INTO table1 SELECT * FROM table1_temp;

DROP TABLE table1_temp;

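Be careful with the second method on a live table: any rows inserted after the copy is made but before the TRUNCATE will be lost. TRUNCATE may also fail, or cause trouble, if other tables reference this one through foreign keys.

I'd suggest adding the unique index in either case, to keep new duplicates from creeping in later.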
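
As a sketch of what that looks like once the table is already free of duplicates (the index name idx_name is reused from above, and the string values are placeholders that assume character columns):

```sql
-- Plain ALTER TABLE (no IGNORE): fails if duplicates are still present,
-- so run it after the cleanup.
ALTER TABLE table1 ADD UNIQUE INDEX idx_name (col1, col2);

-- With the index in place, a repeated pair is rejected.
-- INSERT IGNORE skips the offending row with a warning instead of an error.
INSERT IGNORE INTO table1 (col1, col2) VALUES ('X', 'X');
```

A plain INSERT of an existing pair would raise a duplicate-key error instead, which may be preferable if the application should notice the conflict.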

1 Comment

@strawberry please don't edit my answer to remove large chunks of text and make your own preferred/superficial changes.. add a comment if you want to suggest changes.

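Here is a way using a CTE and the ROW_NUMBER function. Note that this is SQL Server syntax; MySQL 5.5 supports neither CTEs nor window functions (both arrived in MySQL 8.0). The DELETE targets the CTE directly, because joining back to the table on col_1 and col_2 would match every copy in a duplicate group, including the one that should be kept.

;WITH DuplicateRecords AS (
    -- Number the rows within each (col_1, col_2) group: 1, 2, 3, ...
    SELECT ROW_NUMBER() OVER (PARTITION BY col_1, col_2 ORDER BY col_1) AS RW,
           col_1,
           col_2
    FROM [TABLE]
)
-- Deleting through the CTE removes only the rows numbered 2 and above.
DELETE FROM DuplicateRecords
WHERE RW > 1;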
