I have a complex nested replace which I am using to join two tables in MSSQL.

select *
from A
  left outer join B
on
  replace(
    replace(
      replace(
        replace(
          replace(
            replace(A.Column1, '1114', ''),
          '1160', ''),
        '1162', ''),
      '1167', ''),
    '1176', ''),
  '1177', '') = B.Column1

The whole reason I am doing this is that the data in Table 1 contains some noise (numbers like 1160, 1162, etc.), whereas Table 2 contains only clean characters.

Eg. - Table 1 - 'HELLO1160WORLD'
      Table 2 - 'HELLOWORLD'

Now in my situation I should be able to match them as one entry.

My current approach of nested replace does work but I am not convinced that this is an elegant way to do this. Any help will be much appreciated. Thanks

  • I hope that you are using SELECT * for illustrative purposes and not as practice. Commented Oct 26, 2012 at 14:34
  • can you post some sample data and then show what you want the final result to be? Commented Oct 26, 2012 at 14:34
  • Sure. What do you need help with? Commented Oct 26, 2012 at 14:34
  • Yes, I am not actually using *; it is just there to keep things simple. I'll edit the original question to make it clearer. Commented Oct 26, 2012 at 14:37
  • Is 'HELLO1160WORLD' a valid piece of data? If yes, bummer. If no, why not sanitize on the front end? Commented Oct 26, 2012 at 15:51

3 Answers

The problem is that T-SQL does not make it easy to give an expression a name so that you can refer to it from a different place. There is a way to do this, though:

select replaceN
from T
cross apply (select replace1 = replace(T.col, 'x', 'y')) r1
cross apply (select replace2 = replace(replace1, 'x', 'y')) r2
cross apply (select replace3 = replace(replace2, 'x', 'y')) r3
...

This at least gets rid of the crazy nesting. It has no negative performance impact.
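
Applied to the six codes from the question, the chain could look roughly like the sketch below. The tables A and B and the column Column1 come from the question; the aliases r1 to r6 and the column names s1 to s6 are made up for illustration, and each step reads the output of the previous one.

```sql
-- Sketch only: strips the six codes from the question one step at a time.
-- s1..s6 and r1..r6 are illustrative names, not part of the original answer.
select A.Column1, B.Column1 as MatchedColumn1
from A
cross apply (select s1 = replace(A.Column1, '1114', '')) r1
cross apply (select s2 = replace(r1.s1, '1160', '')) r2
cross apply (select s3 = replace(r2.s2, '1162', '')) r3
cross apply (select s4 = replace(r3.s3, '1167', '')) r4
cross apply (select s5 = replace(r4.s4, '1176', '')) r5
cross apply (select s6 = replace(r5.s5, '1177', '')) r6
left outer join B
  on r6.s6 = B.Column1;  -- join on the fully cleaned value
```

Adding or dropping a code then means adding or dropping one line, instead of rebalancing parentheses.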

1 Comment

This is a good solution... once I understood it. "replaceN" is the highest-numbered replace in the sequence; in this example it is replace3.

Maybe use a function to strip the non-alphabetic characters:

Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin

    While PatIndex('%[^a-z]%', @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex('%[^a-z]%', @Temp), 1, '')

    Return @Temp
End

Then you will reference this function in your join:

select a.col1 a, b.col1 b
from tablea a
left join tableb b
  on dbo.RemoveNonAlphaCharacters(a.col1) = b.col1

See SQL Fiddle with Demo
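
A quick way to see what the function does is to call it on literal values. This is only a sketch: it assumes dbo.RemoveNonAlphaCharacters has been created as above and that the database default collation is case-insensitive, so that the [^a-z] pattern treats upper-case letters as letters. Under a case-sensitive or binary collation the upper-case letters could be stripped as well.

```sql
-- Assumes dbo.RemoveNonAlphaCharacters exists and a case-insensitive collation.
select dbo.RemoveNonAlphaCharacters('HELLO1160WORLD')  as cleaned1,
       dbo.RemoveNonAlphaCharacters('HELLO1114WORLD1') as cleaned2;
-- Every non-letter is removed, so the trailing 1 in the second value
-- is dropped along with the 1114 code.
```

Because the function removes every digit rather than a specific list of codes, it only fits data where digits never carry meaning.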

6 Comments

Thanks, that would be an elegant solution. The only problem is that I don't want to remove all non-alphabetic characters. Sorry for not being clear. There could be cases such as HELLO1114WORLD1 and HELLOWORLD1. I have a very specific set of values (transaction codes, to be specific) which I would like to remove.
@dopplesoldner so those two values would not be a match then? Can you post a larger data sample and then the expected result?
Hi @bluefeet, I would like the following pairs to be treated as matching: HELLO1114WORLD1 and HELLOWORLD1; M125ICKY1114MOUSE and M125ICKYMOUSE; RONA1160DINHO10 and RONALDINHO10. Currently, with my nested replace, I am removing certain known identifiers (e.g. 1114, 1160) to get exact matches. Hope it helps, and thanks again!
@dopplesoldner can you apply the function to both columns? similar to this demo -- sqlfiddle.com/#!3/11e59/1
Btw thanks for showing me SQL Fiddle, I had never heard of it!

Bluefeet's suggestion would definitely do a good job of making your query much simpler. However, if you don't want to bother with a function and would rather keep all your code in one place, try this. Before you do the join, you could dump table A into a staging table:

DECLARE @TmpA TABLE(
     Column1 [nvarchar] (50),
     ...
     )

Insert into @TmpA select * from A

Update @TmpA set Column1=Replace(Replace(Replace(Column1,...)))

Select * from @TmpA tmpA
  left outer join B
  on tmpA.Column1=B.Column1

2 Comments

But this way, I am still left with the nasty nested loop (which I am trying to avoid in the first place).
Correct, but it would at least move the nested loop out of the join and into its own section.
