I've added some data to my database and just found out that it contains a lot of duplicates (with different keys, of course), and I want to merge each set of duplicates into a single record.
I'd like to do this within the SQL database itself. I don't want to truncate the table and insert the values again without duplicates, because the script is quite slow.
Here's a sample of my scenario:
Table track:
key | artist | title
----|-----------|--------
k1 | artist1 | title1
k2 | artist1 | title1
k3 | artist1 | title1
Table chart:
trackKey | otherKey | anotherKey | value
---------|----------|--------------|---------
k1 | ok1 | ak4 | v1
k3 | ok2 | ak2 | v2
k1 | ok3 | ak9 | v2
k2 | ok4 | ak1 | v6
where chart.trackKey references track.key
The result that I'd like to achieve is:
Table track:
key | artist | title
----|-----------|--------
k1 | artist1 | title1
Table chart:
trackKey | otherKey | anotherKey | value
---------|----------|--------------|---------
k1 | ok1 | ak4 | v1
k1 | ok2 | ak2 | v2
k1 | ok3 | ak9 | v2
k1 | ok4 | ak1 | v6
so that all duplicates of the same entry in track are merged into one row, and the old keys in chart are replaced with the one key that remains in the track table.
Is there any way to do this in SQL?
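To make the goal concrete, here is a minimal sketch of the transformation on the sample data above. It uses Python's built-in sqlite3 module with an in-memory database rather than my real MySQL setup, so the SQL may need adjusting for MySQL. The function name merge_duplicate_tracks is hypothetical, and keeping the smallest key (MIN) of each artist/title group as the survivor is an assumption, since any one key per group would do.

```python
import sqlite3


def merge_duplicate_tracks(conn):
    """Repoint chart rows at one surviving track key, then delete the other tracks.

    Assumption: the surviving key is the smallest key per (artist, title) group.
    """
    cur = conn.cursor()
    # Step 1: replace every chart.trackKey with the smallest key among all
    # tracks that share the same artist and title as the referenced track.
    cur.execute("""
        UPDATE chart
        SET trackKey = (
            SELECT MIN(t2."key")
            FROM track t1
            JOIN track t2 ON t2.artist = t1.artist AND t2.title = t1.title
            WHERE t1."key" = chart.trackKey
        )
        WHERE trackKey IN (SELECT "key" FROM track)
    """)
    # Step 2: delete every track that is not the surviving key of its group.
    cur.execute("""
        DELETE FROM track
        WHERE "key" NOT IN (
            SELECT MIN("key") FROM track GROUP BY artist, title
        )
    """)
    conn.commit()


# Rebuild the sample data from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE track ("key" TEXT PRIMARY KEY, artist TEXT, title TEXT);
    CREATE TABLE chart (trackKey TEXT, otherKey TEXT, anotherKey TEXT, value TEXT);
    INSERT INTO track VALUES
        ('k1', 'artist1', 'title1'),
        ('k2', 'artist1', 'title1'),
        ('k3', 'artist1', 'title1');
    INSERT INTO chart VALUES
        ('k1', 'ok1', 'ak4', 'v1'),
        ('k3', 'ok2', 'ak2', 'v2'),
        ('k1', 'ok3', 'ak9', 'v2'),
        ('k2', 'ok4', 'ak1', 'v6');
""")

merge_duplicate_tracks(conn)

tracks = conn.execute('SELECT "key", artist, title FROM track').fetchall()
charts = conn.execute(
    "SELECT trackKey, otherKey, anotherKey, value FROM chart ORDER BY otherKey"
).fetchall()
print(tracks)
print(charts)
```

The chart rows are updated first and the duplicate tracks deleted afterwards, so no chart row ever points at a track that has already been removed. The column name key is quoted in the sketch because it is a keyword in several SQL dialects.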
EDIT:
Solution #1, based on @popovitsj's answer:
UPDATE chart c SET trackUri =
  (WITH track_unique AS
    (
      SELECT MIN(uri) AS key, artist, title, album. artwork FROM track
      GROUP BY artist, title
    )
  SELECT tu.key FROM chart c1
  INNER JOIN track t ON c1.trackUri = t.key
  INNER JOIN track_unique tu ON t.artist = tu.artist AND t.title = tu.title
  WHERE c1.trackUri = c.trackUri and c1.countryId = c.countryId and c1.date = c.date);
returns
#1064 - Syntax error near
'track_unique AS (
SELECT MIN(uri) AS key, artist, title, album. artwork FR' line 2
Solution #2, based on @juergen d's answer:
update chart
join track t1 on t1.uri = chart.trackUri
left join
(
  select min(uri) as key
  from track
  group by artist, title
) tmp_track on tmp_track.key = chart.trackUri
set trackkey = tmp_tbl.key
where chart.trackUri not in
(
  select min(uri)
  from track
  group by artist, title
  having count(*) > 1
);
returns
#1064 - Syntax error near
'key
from track
group by artist, title
) tmp_track on tmp_track.key = c' line 5
I don't know what I'm doing wrong, so I'm adding the schema definitions (taken from phpMyAdmin):
