
Due to a bug in a program I've got some semi-duplicate data in my database. I'd like to merge those records (or delete duplicates).

My data looks like this:

usertable:
(userid, username, useremail)
101, joeuser, joeuser@mycompany
102, joeuser, joeuser@mycompany

datatable: 
(userid, datasubmitted)
101, mysubmittedata
102, othersubmitteddata

I would like to get rid of the duplicate IDs and merge the records for either ID under a single userid.

When complete I'd like for the data to look like this:

usertable:
(userid, username, useremail)
101, joeuser, joeuser@mycompany

datatable: 
(userid, datasubmitted)
101, mysubmittedata
101, othersubmitteddata
  • Have you tried looking up UPDATE and DELETE at all? Commented Mar 8, 2011 at 11:43
  • Yes. My problem is how to programmatically select and merge rows. There are several thousand, so it would be difficult to merge them manually. Commented Mar 8, 2011 at 11:47
  • sorry didn't realize there were many duplicate ids in the usertable. Commented Mar 8, 2011 at 13:38

1 Answer

It's a two-step process:

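1. Fix your datatable first. It has no username or useremail columns, so look up each row's user in usertable and repoint it to the lowest userid with the same name and email:

UPDATE datatable SET userid = (SELECT MIN(u2.userid) FROM usertable u1
    JOIN usertable u2 ON u2.username = u1.username AND u2.useremail = u1.useremail
    WHERE u1.userid = datatable.userid)
WHERE userid IN (SELECT userid FROM usertable)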



2. Then remove the duplicates from usertable, keeping the lowest userid for each username and useremail pair:

DELETE FROM usertable WHERE userid > (SELECT MIN(u2.userid) FROM usertable u2
    WHERE u2.username = usertable.username AND u2.useremail = usertable.useremail)
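
To check the two steps against the sample data from the question before touching real tables, here is a minimal sketch. It assumes an in-memory SQLite database driven from Python's sqlite3 module, which is only a stand-in because the question does not say which database is in use. The column types are guesses as well.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usertable (userid INTEGER PRIMARY KEY, username TEXT, useremail TEXT);
CREATE TABLE datatable (userid INTEGER, datasubmitted TEXT);
INSERT INTO usertable VALUES (101, 'joeuser', 'joeuser@mycompany'),
                             (102, 'joeuser', 'joeuser@mycompany');
INSERT INTO datatable VALUES (101, 'mysubmittedata'),
                             (102, 'othersubmitteddata');
""")

# Run both steps in one transaction so a failure leaves the data untouched.
with conn:
    # Step 1: repoint every data row to the lowest userid that shares
    # the same username and useremail.
    conn.execute("""
        UPDATE datatable
        SET userid = (SELECT MIN(u2.userid)
                      FROM usertable u1
                      JOIN usertable u2
                        ON u2.username = u1.username
                       AND u2.useremail = u1.useremail
                      WHERE u1.userid = datatable.userid)
        WHERE userid IN (SELECT userid FROM usertable)
    """)
    # Step 2: delete every user row that is not the lowest userid
    # of its (username, useremail) group.
    conn.execute("""
        DELETE FROM usertable
        WHERE userid > (SELECT MIN(u2.userid)
                        FROM usertable u2
                        WHERE u2.username = usertable.username
                          AND u2.useremail = usertable.useremail)
    """)

users = conn.execute("SELECT * FROM usertable ORDER BY userid").fetchall()
data = conn.execute(
    "SELECT * FROM datatable ORDER BY datasubmitted").fetchall()
print(users)
print(data)
```

The order matters: if the duplicates were deleted first, the rows for userid 102 in datatable could no longer be matched to a surviving user. On MySQL the DELETE as written is rejected, because the subquery reads from the table being deleted from, so it would need to be rewritten there (for example with a derived table or a multi-table DELETE).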