SSIS-Replace the duplicate column with empty string keeping the original column

Question

Can anyone please help me with the requirement below?

I need to check whether a column in a record matches any other column; if it does, I want to replace the duplicate column with an empty string.

Say I have columns x1, x2, and x3. How can I check whether x1 matches any of the x1, x2, x3 columns and, if it matches, replace the duplicate column with an empty string?

Won't x1 always equal x1? Could you provide a clearer example?
– Mark Wojciechowicz, Commented Feb 7, 2017 at 13:59
No, it may or may not. If it matches, then it is a duplicate and I have to replace that duplicate column with an empty string.
– John, Commented Feb 7, 2017 at 14:39
Perhaps I am misunderstanding. Are you trying to see if a value of a column in one record is the same as a value of several columns in a different record?
– Mark Wojciechowicz, Commented Feb 7, 2017 at 15:14
The requirement is this: I have 10k+ records in a file, and each record has customer details. Each record includes three columns for the phone number. I want to search whether a phone number exists in any other record and, if found, replace it with an empty string.
– John, Commented Feb 7, 2017 at 17:20
Just to be clear, you would like to check if the phone numbers in record 1 exist in any of the phone number columns of the other 9,999 records? Or is it: in record 1, you would like to see if the phone number is duplicated across phone number columns and, if so, blank out the repeated values in phone2 or phone3?
– Mark Wojciechowicz, Commented Feb 7, 2017 at 18:42

Caroline Roy · Accepted Answer · 2017-02-07 14:07:45Z

0

Doing this is more complex than one would expect. Here are two options:

1. Try the Fuzzy Lookup: duplicate the file and compare it with itself using a high similarity threshold. I suspect you want to check, for the same record, whether there is a match on other columns, so you will need to create an exact match on the key (go to the Columns tab, right-click the link, and choose Edit Mappings) and do the fuzzy match on the others. You can only link a field once, so duplicate the columns as needed.
2. Write a stored procedure that checks all the column combinations and writes the results to an output table (you can run a stored procedure using the OLE DB Command). I would probably go with this option if I were sure of the "exactness" of the data. Otherwise, go with the fuzzy match.
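
As a rough illustration of the stored procedure option, here is a minimal T-SQL sketch. It assumes the file has already been staged in a table; the names dbo.Contacts, dbo.ContactsClean, dbo.BlankDuplicatePhones, CustomerId, and Phone1 through Phone3 are all hypothetical and not taken from the question. It only handles exact matches within a single row.

```sql
-- Sketch only: table, column, and procedure names are assumptions.
CREATE PROCEDURE dbo.BlankDuplicatePhones
AS
BEGIN
    SET NOCOUNT ON;

    -- SELECT ... INTO fails if the target exists, so drop any earlier output first.
    IF OBJECT_ID('dbo.ContactsClean', 'U') IS NOT NULL
        DROP TABLE dbo.ContactsClean;

    -- Phone1 always survives. Phone2 and Phone3 are blanked when they repeat
    -- an earlier phone column in the same row. A comparison with NULL is never
    -- true, so NULL phones pass through unchanged.
    SELECT
        CustomerId,
        Phone1,
        CASE WHEN Phone2 = Phone1 THEN '' ELSE Phone2 END AS Phone2,
        CASE WHEN Phone3 = Phone1 OR Phone3 = Phone2 THEN '' ELSE Phone3 END AS Phone3
    INTO dbo.ContactsClean
    FROM dbo.Contacts;
END;
GO

-- Run it once after the load has finished.
EXEC dbo.BlankDuplicatePhones;
```

Writing to a separate output table leaves the staged rows untouched, which makes it easy to compare before and after while testing.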


Gordon Bell · Accepted Answer · 2017-02-07 23:19:48Z

0

Since you only have a few columns, you could just run a set of update statements like the following:

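-- Blank Phone2 when it repeats Phone1.
-- (Use '' instead of null if an empty string is required.)
update Contacts
set Phone2 = null
where Phone2 = Phone1;

-- Blank Phone3 when it repeats Phone1.
update Contacts
set Phone3 = null
where Phone3 = Phone1;

-- Blank Phone3 when it repeats Phone2.
update Contacts
set Phone3 = null
where Phone3 = Phone2;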
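
If an empty string is wanted rather than null, the same comparisons can be folded into a single pass. The following is a sketch under assumptions: it uses a table variable with made-up sample rows so it can be run on its own, and the CustomerId column is hypothetical. Within one UPDATE, every expression sees the values from before the update, so the order of the SET clauses does not matter.

```sql
-- Sample data is invented for illustration only.
DECLARE @Contacts TABLE (
    CustomerId INT PRIMARY KEY,
    Phone1     VARCHAR(20) NULL,
    Phone2     VARCHAR(20) NULL,
    Phone3     VARCHAR(20) NULL
);

INSERT INTO @Contacts (CustomerId, Phone1, Phone2, Phone3)
VALUES (1, '555-0100', '555-0100', '555-0199'),  -- Phone2 repeats Phone1
       (2, '555-0101', '555-0102', '555-0102'),  -- Phone3 repeats Phone2
       (3, '555-0103', '555-0103', '555-0103'),  -- both repeat Phone1
       (4, '555-0104', NULL,       '555-0105');  -- no duplicates

-- One pass over the table; only rows with at least one repeat are touched.
UPDATE @Contacts
SET Phone2 = CASE WHEN Phone2 = Phone1 THEN '' ELSE Phone2 END,
    Phone3 = CASE WHEN Phone3 = Phone1 OR Phone3 = Phone2 THEN '' ELSE Phone3 END
WHERE Phone2 = Phone1
   OR Phone3 = Phone1
   OR Phone3 = Phone2;

SELECT CustomerId, Phone1, Phone2, Phone3
FROM @Contacts
ORDER BY CustomerId;
-- Row 1: Phone2 becomes ''; row 2: Phone3 becomes '';
-- row 3: Phone2 and Phone3 become ''; row 4 is unchanged.
```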


Mark Wojciechowicz · Accepted Answer · 2017-02-08 16:23 (edited by Community Bot, 2017-05-23 10:29:33Z)

0

Accomplishing this within an SSIS data flow would be a bit tricky, because you would need to compare the current row against all of the other rows in all of the buffers.

Instead, I would recommend staging the data in a table, as Gordon Bell has suggested. Then you need to determine which row wins when a duplicate is found. You might have a date column to sort it out, or you could add a row number column in the SSIS data flow and sort by the order in which you received the data.

Here is an example of how you might find the winning row and update the others with a self join: Deleting duplicate record in SQL Server
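
For the cross-record reading of the question, one possible shape of the "winning row" update is sketched below. It is an assumption-laden illustration, not the code from the linked answer: it uses ROW_NUMBER() instead of a self join, and it assumes a staging table dbo.Contacts with a hypothetical RowNum column recording arrival order alongside Phone1, Phone2, and Phone3. The first occurrence of each phone number (lowest RowNum, then lowest column position) wins; every later occurrence is blanked.

```sql
-- Sketch only: dbo.Contacts and RowNum are assumed names.
WITH Phones AS (
    -- Unpivot the three phone columns into one row per (record, slot).
    SELECT c.RowNum, p.Slot, p.Phone
    FROM dbo.Contacts AS c
    CROSS APPLY (VALUES (1, c.Phone1), (2, c.Phone2), (3, c.Phone3)) AS p (Slot, Phone)
    WHERE p.Phone IS NOT NULL AND p.Phone <> ''
),
Ranked AS (
    -- Occurrence = 1 marks the winner for each distinct phone number.
    SELECT RowNum, Slot,
           ROW_NUMBER() OVER (PARTITION BY Phone ORDER BY RowNum, Slot) AS Occurrence
    FROM Phones
),
Losers AS (
    -- Collapse to one row per record so each target row is updated exactly once.
    SELECT RowNum,
           MAX(CASE WHEN Slot = 1 THEN 1 ELSE 0 END) AS Blank1,
           MAX(CASE WHEN Slot = 2 THEN 1 ELSE 0 END) AS Blank2,
           MAX(CASE WHEN Slot = 3 THEN 1 ELSE 0 END) AS Blank3
    FROM Ranked
    WHERE Occurrence > 1
    GROUP BY RowNum
)
UPDATE c
SET Phone1 = CASE WHEN l.Blank1 = 1 THEN '' ELSE c.Phone1 END,
    Phone2 = CASE WHEN l.Blank2 = 1 THEN '' ELSE c.Phone2 END,
    Phone3 = CASE WHEN l.Blank3 = 1 THEN '' ELSE c.Phone3 END
FROM dbo.Contacts AS c
JOIN Losers AS l
    ON l.RowNum = c.RowNum;
```

Swapping RowNum for a date column in the ORDER BY would change which record wins, which is the decision the answer says you need to make first.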

answered Feb 8, 2017 at 16:23 by Mark Wojciechowicz; edited May 23, 2017 at 10:29 by Community Bot
