Since it's not that easy to use regular expressions in SQL, I need some advice on how to solve this problem.

I have a column with the following type of data:

Lorem ipsum dolor sit $%foo##amet%$, consectetur adipiscing elit. Nullam odio risus, mollis a interdum vitae, rutrum id leo. Pellentesque dapibus lobortis mattis. Praesent at nisi a orci commodo scelerisque $%bar##%$ eget id dui. Morbi est arcu, ultricies et consequat ac, pretium sed mi. Quisque iaculis pretium congue. Etiam ullamcorper sapien eu mauris tristique at venenatis mauris ultricies. Proin eu vehicula enim. Vestibulum aliquam, mauris ac tempus vulputate, odio mauris rhoncus purus, id suscipit velit erat quis magna.

I need to match the $%...%$ tokens (shown in bold in the original post), and each one needs to be replaced with the text found in its second part, after the ##.

Meaning:

  • $%foo##amet%$ becomes amet
  • $%bar##%$ becomes an empty string.

The pattern as a regex would be something like \$%[^#]+?##([^%]*?)%\$

I can't really use that, though, since regex is not really supported in T-SQL...

Any advice?

4
  • I suggest doing that in a programming language and not in SQL. SQL is not optimized for such things. It's a DB query language, not a language optimized for text processing. Commented Nov 14, 2011 at 10:44
  • @m0skit0 The problem with that would be performance; it's a table that's deleted and refilled each night with over 900.000 rows. Commented Nov 14, 2011 at 10:48
  • In fact, the performance hit would be bigger using SQL, IMHO. A test would clear this up. Commented Nov 14, 2011 at 11:13
  • @red-X: if you're worried about performance - then I'd seriously give SQL-CLR a try! Since the .NET code would be executed within SQL Server, you don't have any of the network traffic and thus better performance overall ... Commented Nov 14, 2011 at 11:52

3 Answers

1

The best option you have is using a nested REPLACE with fixed matching strings (REPLACE compares its search string literally, so the % needs no escaping):

SELECT REPLACE(
           REPLACE(YourColumn, '$%foo##amet%$', 'amet'), '$%bar##%$', '')

Of course, that doesn't have the flexibility of regexes....
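
As a quick check of the nested REPLACE idea, here is a minimal sketch run against a shortened version of the sample text. @Sample is a hypothetical variable standing in for the column; REPLACE matches its search string literally, so every token has to be spelled out in full.

```sql
-- @Sample is a hypothetical stand-in for YourColumn
DECLARE @Sample VARCHAR(4000);
SET @Sample = 'dolor sit $%foo##amet%$, consectetur $%bar##%$ eget';

-- Inner REPLACE handles the first token, outer REPLACE the second one
SELECT REPLACE(
           REPLACE(@Sample, '$%foo##amet%$', 'amet'),
           '$%bar##%$', '') AS Cleaned;
-- Cleaned: 'dolor sit amet, consectetur  eget'
```

Every additional token needs one more REPLACE level, which is the flexibility limit mentioned above.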

Or you could design a SQL-CLR regex library (or find one of the pre-existing ones) and include those into your SQL Server - from 2005 on, SQL Server can execute .NET code, and something like a regex library is a perfect example for extending T-SQL with .NET capabilities.


2 Comments

Since a lot of other software depends on the DB and I have no experience with CLRs, I'm a little hesitant about adding them. There are a lot (understatement) of different combinations, so the first suggestion isn't going to do it either.
@red-X: well, in that case, you probably can't do it within SQL Server, but you'd have to read the data from a .NET app and do the pattern matching/replacement there, and then update any rows changed.....
0

You can use the following code:

SELECT REPLACE(REPLACE(YourColumn, ' $%amet%$ ', ' amet '), ' $%bar%$ ', ' ')

The % character represents "Any string of zero or more characters" as explained here. You need to put spaces to identify only single words :)

4 Comments

The strings can be pretty much anything, so that would be a LOT of replaces that need to be edited into the query any time another one gets added, so that's out of the question.
Ohh, it's not only those two examples. Are they always one word (without spaces)? And do you want to replace the words that have amet or bar?
I want to replace the complete pattern from $ to $ with the second word (so amet and the empty string in the example)
If you just want to replace ..amet.. and ...bar.. for bar and empty you can use my answer.
0

I fixed it with this function:

CREATE FUNCTION [dbo].[ReplaceWithDefault]
(
   @InputString VARCHAR(4000)
)
RETURNS VARCHAR(4000)
AS
BEGIN
    DECLARE @Pattern VARCHAR(100) SET @Pattern = '$[%]_%##%[%]$'
    -- working copy of the string
    DECLARE @Result VARCHAR(4000) SET @Result = @InputString
    -- current match of the pattern
    DECLARE @CurMatch VARCHAR(500) SET @CurMatch = ''
    -- string to replace the current match
    DECLARE @Replace VARCHAR(500) SET @Replace = ''
    -- start + end of the current match
    DECLARE @Start INT
    DECLARE @End INT
    -- length of current match
    DECLARE @CurLen INT
    -- Length of the total string -- 8001 if @InputString is NULL
    DECLARE @Len INT SET @Len = COALESCE(LEN(@InputString), 8001)

    WHILE (PATINDEX('%' + @Pattern + '%', @Result) != 0) 
    BEGIN
        SET @Replace = ''

        SET @Start = PATINDEX('%' + @Pattern + '%', @Result)
        SET @CurMatch = SUBSTRING(@Result, @Start, @Len)

        -- position of the closing '$' of the first '%$' in the remaining text
        SET @End = PATINDEX('%[%]$%', @CurMatch) + 1
        SET @CurMatch = LEFT(@CurMatch, @End)

        SET @CurLen = LEN(@CurMatch)

        SET @Replace = REPLACE(RIGHT(@CurMatch, @CurLen - (PATINDEX('%##%', @CurMatch)+1)), '%$', '')

        SET @Result = REPLACE(@Result, @CurMatch, @Replace)
    END
    RETURN(@Result)
END
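
A possible way to use the function, as a sketch only: dbo.YourTable is a hypothetical table name, and YourColumn is assumed to fit the function's VARCHAR(4000) parameter. The first statement checks a single value; the second applies the function to the rows that still contain a token, using the same pattern as the function.

```sql
-- Single-value check against a shortened sample string
SELECT dbo.ReplaceWithDefault('sit $%foo##amet%$, scelerisque $%bar##%$ eget') AS Cleaned;
-- Cleaned: 'sit amet, scelerisque  eget'

-- dbo.YourTable is a hypothetical name; only rows that contain a token are touched
UPDATE dbo.YourTable
SET YourColumn = dbo.ReplaceWithDefault(YourColumn)
WHERE YourColumn LIKE '%$[%]_%##%[%]$%';
```

The WHERE clause keeps the scalar function from being called on rows without any token, which may matter on a table that is refilled nightly with many rows.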
