Remove duplicate data from MySQL table across two columns

Question

I have a MySQL table that looks like:
(unique_id, uid_data1, uid_data2, sorting_data1, sorting_data2)

This table is used in a tool where bidirectional relations weren't supported until now, so the table contains data that looks like this (field order as in the line above):
(1, 1212, 2034, 1, 1)
(2, 2034, 1212, 1, 1)
(3, 4567, 9876, 1, 0)
(4, 9876, 4567, 0, 1)

The table also contains "single-directed" relations, e.g.
(5, 5566, 8899, 1, 9)
=> no row exists for (?, 8899, 5566, 9, 1)

As the tool now supports bidirectional/symmetric relations, I would like to remove the duplicate data from the MySQL table. However, I'm having some trouble finding an appropriate query to do this.
In the example above I would like to delete the rows with the uids 2 and 4 (as their data is already stored in rows 1 and 3).

First, I tried to set up a SELECT statement to see which entries would be deleted.
I thought of a JOIN query:

SELECT x.uid, x.uid_link1, x.uid_link2, y.uid_link1 as 'uid_link2', y.uid_link2 as 'uid_link1'
FROM tx_sdfilmbase_hilfstab x
INNER JOIN tx_sdfilmbase_hilfstab y ON x.uid_link1=y.uid_link2 AND x.uid_link2=y.uid_link1
WHERE ???
ORDER BY x.uid_link1, x.uid_link2

However, I'm stuck at the point where I have to tell MySQL to select only one half of each pair of records.
Any suggestions on how to do this?

P.S. Deleting each single record manually in the table isn't an option, as the table contains several thousand rows ;-)

This isn't going to be syntactically accurate, but something like delete from tx_sdfilmbase_hilfstab where uid_link2 in (select uid_link1 from tx_sdfilmbase_hilfstab) might work...
– Chris Thompson, Commented Sep 22, 2012 at 19:44
But then I might be deleting rows with no double entry. For example, with (1122, 2233), (2233, 1122), (5566, 1122), both (2233, 1122) and (5566, 1122) would be deleted, although only (2233, 1122) should be deleted, as (5566, 1122) has no double entry.
– Stefan, Commented Sep 22, 2012 at 19:54
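
To see why a single-column test is too loose, here is a minimal sketch run against a scratch table. The table name link_demo and the uid values 1 to 3 are assumptions made up for illustration; the three pairs are the ones from the comment above.

```sql
-- Scratch table (hypothetical name) holding the three pairs from the comment.
CREATE TABLE link_demo (
    uid       INT PRIMARY KEY,
    uid_link1 INT NOT NULL,
    uid_link2 INT NOT NULL
);

INSERT INTO link_demo (uid, uid_link1, uid_link2) VALUES
    (1, 1122, 2233),
    (2, 2233, 1122),
    (3, 5566, 1122);

-- Rows the IN-based condition would select for deletion.
SELECT uid, uid_link1, uid_link2
FROM link_demo
WHERE uid_link2 IN (SELECT uid_link1 FROM link_demo)
ORDER BY uid;

-- Clean up the scratch table.
DROP TABLE link_demo;
```

The SELECT returns all three rows, because every uid_link2 value also occurs somewhere in uid_link1. The condition never checks that the two values belong to the same mirrored row, so the one-directional (5566, 1122) and even the first row of the pair are matched. Separately, MySQL typically rejects a DELETE whose subquery reads from the table being deleted from (error 1093), so this form would need rewriting in any case.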

Tony Hopkinson · Accepted Answer · 2012-09-22 20:13:35Z

4

SELECT t.*
FROM MyTable t
INNER JOIN MyTable tt
   ON t.uid_data1 = tt.uid_data2
  AND t.uid_data2 = tt.uid_data1
  AND t.unique_id > tt.unique_id

This should find the "second" row of each pair (records 2 and 4 in your example).
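
As a quick check, the join above can be tried on the five sample rows from the question. This is only a sketch: the table name relation_demo and the INT column types are assumptions, and a regular table is used because MySQL does not allow a TEMPORARY table to be referenced twice in one query.

```sql
-- Scratch table (hypothetical name) with the column layout from the question.
CREATE TABLE relation_demo (
    unique_id     INT PRIMARY KEY,
    uid_data1     INT NOT NULL,
    uid_data2     INT NOT NULL,
    sorting_data1 INT NOT NULL,
    sorting_data2 INT NOT NULL
);

-- The five sample rows given in the question.
INSERT INTO relation_demo VALUES
    (1, 1212, 2034, 1, 1),
    (2, 2034, 1212, 1, 1),
    (3, 4567, 9876, 1, 0),
    (4, 9876, 4567, 0, 1),
    (5, 5566, 8899, 1, 9);

-- For every mirrored pair, return the row with the higher unique_id.
SELECT t.*
FROM relation_demo t
INNER JOIN relation_demo tt
   ON t.uid_data1 = tt.uid_data2
  AND t.uid_data2 = tt.uid_data1
  AND t.unique_id > tt.unique_id
ORDER BY t.unique_id;
-- Returns rows 2 and 4:
--   (2, 2034, 1212, 1, 1)
--   (4, 9876, 4567, 0, 1)

-- Clean up the scratch table.
DROP TABLE relation_demo;
```

Row 5 has no mirrored counterpart, so it never satisfies the join and is left alone. The comparison on unique_id is what breaks the symmetry: without it, rows 1 and 3 would be returned as well.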

If I got it right then

DELETE t
FROM MyTable t
INNER JOIN MyTable tt
   ON t.uid_data1 = tt.uid_data2
  AND t.uid_data2 = tt.uid_data1
  AND t.unique_id > tt.unique_id

should do the job
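
Since the DELETE is destructive, one cautious way to run it is to copy the data first and confirm afterwards that no mirrored pairs remain. The following is a sketch, not a tested migration: MyTable is the placeholder name used in this answer, and MyTable_backup is a hypothetical name chosen here.

```sql
-- 1. Keep a copy of the rows (hypothetical backup table name).
--    CREATE TABLE ... AS SELECT copies the data but not the indexes.
CREATE TABLE MyTable_backup AS SELECT * FROM MyTable;

-- 2. Remove the later row of every mirrored pair.
DELETE t
FROM MyTable t
INNER JOIN MyTable tt
   ON t.uid_data1 = tt.uid_data2
  AND t.uid_data2 = tt.uid_data1
  AND t.unique_id > tt.unique_id;

-- 3. Count mirrored pairs that are still present; 0 is expected.
SELECT COUNT(*) AS remaining_pairs
FROM MyTable t
INNER JOIN MyTable tt
   ON t.uid_data1 = tt.uid_data2
  AND t.uid_data2 = tt.uid_data1
  AND t.unique_id > tt.unique_id;
```

Once the result looks right, the backup table can be dropped. Note that the DELETE keeps the sorting_data1 and sorting_data2 values of the surviving row only, so any sorting information stored solely on the removed row is lost.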

edited Sep 22, 2012 at 20:13

answered Sep 22, 2012 at 19:55

Tony Hopkinson


5 Comments

Stefan Over a year ago

This is even better than having a WHERE clause. Thx!

Tony Hopkinson Over a year ago

This is a problem I've had to solve once or twice myself. :( :(

Stefan Over a year ago

Hm... The SELECT-Statement works perfectly, but the DELETE doesn't seem to work - it says that there is a syntax error... However, according to the manual, JOINs should be allowed for DELETE-Statements

Tony Hopkinson Over a year ago

Oops, fixed I think. I haven't got MySQL handy.

Stefan Over a year ago

Uff right, didn't see that either - maybe a bit late already/sleeping time ;-)

RomanKonz · Accepted Answer · 2012-09-22 19:44:39Z

1

So one row will be

uid_link1=1, uid_link2=9

and the other one

uid_link1=9, uid_link2=1

right?

What about

... WHERE x.uid_link1 < y.uid_link1 ...

Note that this will not remove duplicates where uid_link1 = uid_link2.

Edit: or you can use ... WHERE x.unique_id < y.unique_id
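
Put together with the query from the question, the second suggestion might look like the sketch below. It assumes the real column names are the ones used in the asker's query (uid, uid_link1, uid_link2) rather than unique_id; the aliases kept_uid and duplicate_uid are made up here for readability.

```sql
-- List each mirrored pair once: x is the earlier row, y is its duplicate.
SELECT x.uid AS kept_uid,
       y.uid AS duplicate_uid,
       x.uid_link1,
       x.uid_link2
FROM tx_sdfilmbase_hilfstab x
INNER JOIN tx_sdfilmbase_hilfstab y
   ON x.uid_link1 = y.uid_link2
  AND x.uid_link2 = y.uid_link1
WHERE x.uid < y.uid
ORDER BY x.uid_link1, x.uid_link2;
```

Comparing the primary key rather than uid_link1 also covers rows whose two link columns hold the same value: such a row would otherwise join with itself, and two copies of it could not be told apart by uid_link1 alone.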

answered Sep 22, 2012 at 19:44

RomanKonz


1 Comment

Stefan Over a year ago

Nice, "WHERE x.unique_id < y.unique_id" seems to do the job. thanks a lot!
