I have a database of movies and their links. One movie may have several links. Unfortunately, some movies have the same link twice.

For example:

row1 Alien vs. Predator http://www.avplink1
row2 Alien vs. Predator http://www.avplink1
row3 Alien vs. Predator http://www.avplink2
row4 Alien vs. Predator http://www.avplink3
row5 Minions http://www.minionslink1
row6 Minions http://www.minionslink1

I would like to delete the rows that appear more than once in the table, but keep one copy of each. So I would like this result:

row1 Alien vs. Predator http://www.avplink1
row3 Alien vs. Predator http://www.avplink2
row4 Alien vs. Predator http://www.avplink3
row5 Minions http://www.minionslink1

How can I write an SQL query that deletes these rows? Thanks!

EDIT:

I solved it with this code:

DELETE a
FROM links a
JOIN (SELECT MIN(id) AS id, movielink
      FROM links
      GROUP BY movielink) b
  ON a.movielink = b.movielink
 AND a.id <> b.id;

Thanks, everyone, for the help!

  • Do you have a unique id in each row? Commented Jul 18, 2015 at 16:31

2 Answers

This is almost a duplicate of this question, except that you put

delete from

instead of

select * from

1 Comment

Yes, with this: SELECT * FROM links INNER JOIN (SELECT movielink FROM links GROUP BY movielink HAVING count(id) > 1) mov ON links.movielink = mov.movielink - I can list the duplicated movie links, but if I change the select * to delete, I get this error message: #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'INNER JOIN (SELECT movielink FROM links GROUP BY movielink HAVING count(id) > 1)' at line 2
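
The 1064 error is most likely a syntax issue: when a MySQL DELETE involves a join, the table to delete from has to be named between DELETE and FROM. A literal conversion of the SELECT above would also remove every copy of a duplicated link, not just the extras. Here is a minimal sketch that keeps the row with the lowest id per link, assuming id is unique in links (keep_id is an illustrative alias, not part of the original query):

```sql
-- Name the target table (links) between DELETE and FROM for a joined delete.
DELETE links
FROM links
INNER JOIN (
    -- One row per duplicated link, with the id of the copy to keep.
    SELECT movielink, MIN(id) AS keep_id
    FROM links
    GROUP BY movielink
    HAVING COUNT(id) > 1
) mov ON links.movielink = mov.movielink
WHERE links.id <> mov.keep_id;
```

Running the inner SELECT on its own first shows which links are affected and which id would survive.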

This is a pain without a unique id in each row (all tables should have a primary key). Probably the easiest way is to use a temporary table in that case:

create temporary table tempt as
    select distinct movie, link
    from t;

truncate table t;

insert into t(movie, link)
    select movie, link
    from tempt;

There are simpler ways if you have a unique id. After doing this, put a unique index on the table to prevent this from happening in the future:

create unique index idx_t_movie_link on t(movie, link);

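Note that in MySQL the create unique index statement fails with a duplicate-key error while duplicate rows are still present, so run it only after the cleanup. Older MySQL versions could drop duplicates as a side effect of ALTER IGNORE TABLE ... ADD UNIQUE INDEX, but that option has since been removed, and I don't recommend using index creation to delete rows anyway.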
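
With the unique index in place, a plain INSERT of an existing (movie, link) pair is rejected with a duplicate-key error. If the loading code should skip such rows quietly instead, MySQL's INSERT IGNORE is one option. A small sketch, using the table and column names from this answer and a sample row from the question:

```sql
-- This pair already exists after the cleanup, so the row is skipped
-- with a warning instead of raising a duplicate-key error.
INSERT IGNORE INTO t (movie, link)
VALUES ('Minions', 'http://www.minionslink1');
```

Keep in mind that IGNORE also downgrades some other errors to warnings, so it is worth checking the warnings after a bulk load.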

EDIT:

If you have a unique row identifier, then you can just do:

delete t
    from t join
         (select movie, link, min(rowid) as minrowid
          from t
          group by movie, link
         ) tt
         on t.movie = tt.movie and t.link = tt.link and t.rowid <> tt.minrowid
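
Before and after running either approach, it may help to check which pairs still occur more than once. A quick sketch using the same t, movie, and link names (copies is just an illustrative alias); it should return no rows once the cleanup has worked:

```sql
-- List every (movie, link) pair that appears more than once.
SELECT movie, link, COUNT(*) AS copies
FROM t
GROUP BY movie, link
HAVING COUNT(*) > 1;
```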
