Say I have a news_stories table, and stories can be related to one another through a related_stories table.

The schema of related_stories would be this:

related_stories
----------------
id: INT
story_id: INT
related_id: INT

Originally there was no validation to prevent multiple relations between the same two stories, so you sometimes end up with two related_stories records like these:

id: 1, story_id: 3, related_id: 4

and

id: 2, story_id: 4, related_id: 3

In essence, these are duplicates: both rows describe the same relationship between stories 3 and 4.

I can now add a validation to prevent this from happening, but that doesn't change the fact that I still have thousands of duplicate records (or rather, records that create the same relationship).

I need some way to clear out these old duplicates, leaving only a single record per relationship. This would be pretty simple if it were based on a single field, but since the ids can be in either field it seems tricky to me.

How can I remove duplicates of these records in MySQL? For some reason it just doesn't occur to me. Solutions for Rails would also be welcome, though I'd prefer plain MySQL.

  • Does related_stories have a unique id column, or is it just the 2 columns: story_id, related_id? Commented Oct 23, 2015 at 16:07
  • There is a unique ID column. I just left that out for simplicity. I will edit my question to include it for clarity. Commented Oct 23, 2015 at 16:16

2 Answers

1

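Delete the (greatest, least) combination, i.e. keep (1,2) and remove (2,1). Using the table and column names from the question:

-- lo/hi are the smaller and larger story id of each relationship
delete rs from related_stories rs join (
  select least(story_id, related_id) lo, greatest(story_id, related_id) hi
  from   related_stories
  group by lo, hi
  having count(*) > 1   -- only relationships stored more than once
) d on rs.story_id = d.hi and rs.related_id = d.lo;  -- remove the (hi, lo) row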

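You could also modify this to keep a row based on the min/max id. Note that the query above assumes each duplicated relationship is stored once in each direction; choosing the survivor by id would also cover exact repeats of the same row.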

0
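In Rails, group the records by their sorted id pair and delete all but the lowest id in each group:

# Key each record by its order-independent pair, so (3,4) and (4,3) match.
groups = RelatedStories.all.group_by { |rs| [rs.story_id, rs.related_id].sort }

# Keep the lowest id per group; everything else is a duplicate.
duplicate_ids = groups.values.flat_map { |g| g.map(&:id).sort.drop(1) }

RelatedStories.where(id: duplicate_ids).delete_all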
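
The sort-the-pair idea in this answer can be tried out without a database. Below is a minimal plain-Ruby sketch, not tied to Rails: the `Relation` struct and the `duplicate_ids` helper are hypothetical names made up for illustration, and the sample rows extend the two from the question. It assumes you want to keep the lowest id for each relationship.

```ruby
# Hypothetical in-memory stand-in for a related_stories row.
Relation = Struct.new(:id, :story_id, :related_id)

# Returns the ids of rows that repeat an existing relationship,
# keeping the row with the lowest id in each group.
def duplicate_ids(relations)
  relations
    .group_by { |r| [r.story_id, r.related_id].sort } # (3,4) and (4,3) share a key
    .values
    .flat_map { |group| group.sort_by(&:id).drop(1).map(&:id) }
end

rows = [
  Relation.new(1, 3, 4),
  Relation.new(2, 4, 3), # mirror of row 1
  Relation.new(3, 5, 6), # unrelated relationship, must survive
  Relation.new(4, 4, 3)  # exact repeat of row 2
]

p duplicate_ids(rows) # prints [2, 4]
```

Because the key is the sorted pair, mirrored rows and exact repeats land in the same group, so one pass handles both cases.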
