4

I want to remove repeating strings in a given table's column.

Here are some examples:

Input     | Expected Output
---------------------------
XYXY      | XY
AA        | A
XYZXYZ    | XYZ
ABCABCABC | ABC

How can I do it?

5
  • 2
    You haven't really asked a specific question...what have you tried so far? Commented Oct 30, 2014 at 11:18
  • Does this question really deserve an upvote? Commented Oct 30, 2014 at 11:25
  • 2
    @Ullas: I think so. I find it interesting. To me it is clear, and I would like to see an answer to it. Commented Oct 30, 2014 at 11:28
  • What if there isn't a repeating string (e.g. input "AB")? Or a repeated string somewhere in the middle (e.g. "AXXB")? Or multiple repeats, etc.? Commented Oct 30, 2014 at 11:51
  • See my comments on the answer below. I suspect that in the real world not all your examples are in a nice, uniform, alphabetical order. Commented Oct 30, 2014 at 12:42

2 Answers

5

This query should help. The second argument is the maximum number of distinct characters to keep from each value:

SELECT dbo.RemoveDuplicate(ColumnName, VariableLength) FROM TableName;

Example:

SELECT dbo.RemoveDuplicate(StudentName, 20) FROM Students;

Function to remove the duplicate characters:

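CREATE FUNCTION dbo.RemoveDuplicate (@sInputString AS VARCHAR(MAX), @nLength AS INT)
RETURNS VARCHAR(MAX) AS
BEGIN
    -- Keeps the first occurrence of each character and drops later ones.
    -- VARCHAR(MAX) avoids silently truncating inputs longer than 10 characters.
    DECLARE @count INT
    DECLARE @new_string VARCHAR(MAX)
    IF ( @sInputString IS NULL )
        RETURN NULL
    SET @count = 1
    SET @new_string = ''
    -- Stop after @nLength characters or once the input is used up.
    -- DATALENGTH is used because LEN ignores trailing spaces.
    WHILE ( @count <= @nLength AND DATALENGTH(@sInputString) > 0 )
      BEGIN
          -- Append the current first character to the result ...
          SET @new_string = @new_string + SUBSTRING(@sInputString, 1, 1)
          -- ... then remove every occurrence of it from the remaining input.
          SET @sInputString = REPLACE(@sInputString, SUBSTRING(@sInputString, 1, 1), '')
          SET @count = @count + 1
      END
    RETURN @new_string
END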
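
A quick way to see what the function does is to call it on literals. This is only a sketch, assuming the function above has been created in the dbo schema; the last two inputs are illustrative ones not taken from the question. Note that it removes repeated characters (keeping the first occurrence of each) rather than detecting repeated substrings, so the results match the question's examples only because those examples never reuse a character inside one repetition.

```sql
-- Assumes dbo.RemoveDuplicate from this answer already exists.
-- REPLACE follows the collation of its input, so under a case-insensitive
-- collation 'a' and 'A' count as the same character.
SELECT dbo.RemoveDuplicate('XYXY', 20)      AS r1,  -- XY
       dbo.RemoveDuplicate('ABCABCABC', 20) AS r2,  -- ABC
       dbo.RemoveDuplicate('AXXB', 20)      AS r3,  -- AXB (not a repeated substring)
       dbo.RemoveDuplicate('ABBA', 20)      AS r4;  -- AB
```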

1 Comment

Have you found a more optimized way?
1

I used three steps to get the output.

First, a recursive CTE splits each row into its individual letters.

Second, the CTE assigns a row_number() to each input row, so the letters can be traced back to the row they came from.

Third, the outer query groups by that row number and concatenates the distinct letters of each row.

CREATE TABLE #input
  (name VARCHAR(50))

INSERT INTO #input
VALUES      ('XYXY'),
            ('AA'),
            ('XYZXYZ'),
            ('ABCABCABC');

WITH cte
     AS (SELECT Row_number()OVER (ORDER BY name)    rn,
                Substring(name, 1, 1) AS sub,
                1                     AS IDX,
                name
         FROM   #input
         WHERE  Len(name) > 0
         UNION ALL
         SELECT rn,Substring(name, IDX + 1, 1) AS sub,
                IDX + 1                     AS IDX,
                name
         FROM   cte
         WHERE  IDX < Len(name))
SELECT name INPUT, (SELECT DISTINCT CONVERT(VARCHAR(100), sub)
                 FROM   cte b
                 WHERE  b.rn = a.rn
                 FOR XML PATH('')) EXPECTED_OUTPUT
FROM   cte a
GROUP  BY rn ,name

OUTPUT

INPUT       EXPECTED_OUTPUT
---------   ---------------
AA          A
ABCABCABC   ABC
XYXY        XY
XYZXYZ      XYZ

1 Comment

Good answer, but it mangles most non-contrived examples. Try an input of Jamiea: it has a repeated a, but the output is mangled. As the comments above anticipated, the OP has not provided enough info to answer this question properly. All the OP's examples were in alphabetical order; I suspect the real world is not so uniform.
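
For inputs like that, one possible reading of the question is "collapse a value only when it consists of whole repetitions of a single unit". The following is a minimal sketch under that assumption, not something either answer proposed. The function name dbo.SmallestRepeatingUnit and the VARCHAR(8000) limit are hypothetical choices made for illustration. It tries each prefix length that divides the string length and returns the first prefix that rebuilds the input when replicated; if none does, the input is returned unchanged.

```sql
-- Hypothetical helper (not from the original answers).
-- CREATE FUNCTION must be the first statement in its batch.
CREATE FUNCTION dbo.SmallestRepeatingUnit (@s VARCHAR(8000))
RETURNS VARCHAR(8000) AS
BEGIN
    DECLARE @len INT = DATALENGTH(@s);  -- counts trailing spaces, unlike LEN
    DECLARE @n   INT = 1;               -- candidate unit length

    -- A repeating unit can be at most half as long as the input.
    WHILE @n <= @len / 2
    BEGIN
        -- Only lengths that divide the input evenly can tile it exactly.
        IF @len % @n = 0
           AND REPLICATE(LEFT(@s, @n), @len / @n) = @s
            RETURN LEFT(@s, @n);
        SET @n = @n + 1;
    END

    RETURN @s;  -- no repetition found (or NULL input): return as-is
END
```

Called as SELECT dbo.SmallestRepeatingUnit(name) FROM #input, this should give XY, A, XYZ and ABC for the question's examples, while Jamiea, AB and AXXB come back unchanged. The comparison uses the column's collation, so under a case-insensitive collation 'Abab' would collapse to 'Ab'.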
