I want to split a comma-separated string and place each piece into its matching column in SQL Server 2008. The table structure is as follows:

CREATE TABLE SAMP(COMMASEPA VARCHAR(255),X VARCHAR(10),Y VARCHAR(10),Z VARCHAR(10),A VARCHAR(10),B VARCHAR(10),C VARCHAR(10),D VARCHAR(10))
INSERT INTO SAMP VALUES('X=1,Y=2,Z=3',null,null,null,null,null,null,null),
('X=3,Y=4,Z=5,A=6',null,null,null,null,null,null,null),
('X=1,Y=2,Z=3,A=5,B=6,C=7,D=8',null,null,null,null,null,null,null)

I want to split the string on commas and route each piece by its key, which is one of X, Y, Z, A, B, C, or D. For example, for the first row, X=1 should go in the X column, Y=2 in the Y column, and Z=3 in the Z column. Any ideas on how to do this would be appreciated. Thank you.

4 Answers

1

You can see this working on SQL Fiddle: http://sqlfiddle.com/#!3/8c3ee/32

Here's the meat of it:

-- Build an XML document from each commasepa string, then pull each key out by name.
with parsed as (
  select
    commasepa,
    root.value('(/root/s/col[@name="X"])[1]', 'varchar(20)') as X,
    root.value('(/root/s/col[@name="Y"])[1]', 'varchar(20)') as Y,
    root.value('(/root/s/col[@name="Z"])[1]', 'varchar(20)') as Z,
    root.value('(/root/s/col[@name="A"])[1]', 'varchar(20)') as A,
    root.value('(/root/s/col[@name="B"])[1]', 'varchar(20)') as B,
    root.value('(/root/s/col[@name="C"])[1]', 'varchar(20)') as C,
    root.value('(/root/s/col[@name="D"])[1]', 'varchar(20)') as D
  from
  (
    select
      commasepa,
      -- 'X=1,Y=2' becomes <root><s><col name="X">1</col></s><s><col name="Y">2</col></s></root>
      CONVERT(xml, '<root><s><col name="' + REPLACE(REPLACE(COMMASEPA, '=', '">'), ',', '</col></s><s><col name="') + '</col></s></root>') as root
    from
      samp
  ) src
)
-- Copy the parsed values back; a key missing from a string yields NULL.
update
  samp
set
  samp.x = parsed.x,
  samp.y = parsed.y,
  samp.z = parsed.z,
  samp.a = parsed.a,
  samp.b = parsed.b,
  samp.c = parsed.c,
  samp.d = parsed.d
from
  parsed
where
  parsed.commasepa = samp.commasepa;

Full disclosure - I'm the author of sqlfiddle.com

This works by first converting each commasepa string into an XML object that looks like this:

<root>
 <s>
  <col name="X">1</col>
 </s>
 <s>
  <col name="Y">2</col>
 </s>
  ....
</root>

Once the string is in that format, I use the XQuery support that SQL Server 2005 and later provide, which is the .value('(/root/s/col[@name="X"])[1]', 'varchar(20)') part. The first argument is a path expression that picks the first col element whose name attribute is X, and the second is the SQL type to return it as. I select each potential column individually, so a column is populated when its key is present and NULL otherwise. I define that normalized result set as a Common Table Expression (CTE) called 'parsed', and the update statement joins it back to the original table on commasepa to populate the columns.
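
To see the transformation in isolation, here is a minimal sketch that runs the same REPLACE and .value() steps against a single literal instead of the table. The variable names @s and @x are illustrative and not part of the original query.

```sql
-- Sketch: parse one literal string (@s and @x are hypothetical names).
DECLARE @s VARCHAR(255) = 'X=1,Y=2,Z=3';

-- Same string-to-XML rewrite as in the CTE.
DECLARE @x XML = CONVERT(XML,
    '<root><s><col name="'
    + REPLACE(REPLACE(@s, '=', '">'), ',', '</col></s><s><col name="')
    + '</col></s></root>');

SELECT
    @x AS generated_xml,
    @x.value('(/root/s/col[@name="X"])[1]', 'varchar(20)') AS X,  -- '1'
    @x.value('(/root/s/col[@name="A"])[1]', 'varchar(20)') AS A;  -- NULL, no A key in @s
```

One caveat: because the string is converted to XML, a value containing characters such as & or < would make the CONVERT fail unless they are escaped first.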

1 Comment

Thanks a lot for the input, it's working. If possible, can you please explain the root.value in the code and the steps you followed? Thank you, I appreciate it!

0

With the help of a Split function:

CREATE FUNCTION [dbo].[SplitStrings]
(
   @List       VARCHAR(MAX),
   @Delimiter  CHAR(1)
)
RETURNS TABLE
AS
   RETURN (
      SELECT Item
      FROM (
         -- Wrap each piece in <i>...</i>, convert to XML, and shred the <i> nodes into rows.
         SELECT Item = x.i.value('(./text())[1]', 'varchar(max)')
         FROM (
            SELECT [XML] = CONVERT(XML, '<i>' + REPLACE(@List, @Delimiter, '</i><i>') + '</i>').query('.')
         ) AS a
         CROSS APPLY [XML].nodes('i') AS x(i)
      ) AS y
      WHERE Item IS NOT NULL  -- drop empty pieces
   );
GO
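
As a quick check of what the function returns on its own, it can be called against a literal. This is only a sketch and assumes dbo.SplitStrings has already been created as defined above.

```sql
-- Assumes dbo.SplitStrings exists in the current database.
SELECT Item
FROM dbo.SplitStrings('X=1,Y=2,Z=3', ',');
-- Returns one row per piece: 'X=1', 'Y=2', 'Z=3'
-- (row order is not guaranteed without an ORDER BY).
```

Each Item keeps its key prefix, so matching on LIKE 'X=%' stores the whole piece, such as 'X=1', in column X.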

You can do it this way:

-- Split each row into pieces, then pivot the pieces back into one row per COMMASEPA.
;WITH x AS
(
    SELECT s.*, f.Item
        FROM samp AS s
        CROSS APPLY dbo.SplitStrings(s.COMMASEPA, ',') AS f
), p AS 
( 
    SELECT x.COMMASEPA, 
        X = MAX(CASE WHEN x.Item LIKE 'X=%' THEN x.Item END),
        Y = MAX(CASE WHEN x.Item LIKE 'Y=%' THEN x.Item END),
        Z = MAX(CASE WHEN x.Item LIKE 'Z=%' THEN x.Item END),
        A = MAX(CASE WHEN x.Item LIKE 'A=%' THEN x.Item END),
        B = MAX(CASE WHEN x.Item LIKE 'B=%' THEN x.Item END),
        C = MAX(CASE WHEN x.Item LIKE 'C=%' THEN x.Item END),
        D = MAX(CASE WHEN x.Item LIKE 'D=%' THEN x.Item END)
    FROM x GROUP BY x.COMMASEPA
)
UPDATE s SET X = p.X, Y = p.Y, Z = p.Z, 
  A = p.A, B = p.B, C = p.C, D = p.D
FROM samp AS s INNER JOIN p 
ON p.COMMASEPA = s.COMMASEPA;

0
DECLARE @SAMP TABLE
(
  COMMASEPA VARCHAR(255),
  X VARCHAR(10),
  Y VARCHAR(10),
  Z VARCHAR(10),
  A VARCHAR(10),
  B VARCHAR(10),
  C VARCHAR(10),
  D VARCHAR(10)
)
INSERT INTO @SAMP VALUES
('X=1,Y=2,Z=3',null,null,null,null,null,null,null),
('X=3,Y=4,Z=5,A=6',null,null,null,null,null,null,null),
('X=1,Y=2,Z=3,A=5,B=6,C=7,D=8',null,null,null,null,null,null,null)

update S set
  X = case when P.X > 3 then substring(T.COMMASEPA, P.X, charindex(',', T.COMMASEPA, P.X) - P.X) end,
  Y = case when P.Y > 3 then substring(T.COMMASEPA, P.Y, charindex(',', T.COMMASEPA, P.Y) - P.Y) end,
  Z = case when P.Z > 3 then substring(T.COMMASEPA, P.Z, charindex(',', T.COMMASEPA, P.Z) - P.Z) end,
  A = case when P.A > 3 then substring(T.COMMASEPA, P.A, charindex(',', T.COMMASEPA, P.A) - P.A) end,
  B = case when P.B > 3 then substring(T.COMMASEPA, P.B, charindex(',', T.COMMASEPA, P.B) - P.B) end,
  C = case when P.C > 3 then substring(T.COMMASEPA, P.C, charindex(',', T.COMMASEPA, P.C) - P.C) end,
  D = case when P.D > 3 then substring(T.COMMASEPA, P.D, charindex(',', T.COMMASEPA, P.D) - P.D) end
from @SAMP as S
  cross apply (select ','+S.COMMASEPA+',') as T(COMMASEPA)
  cross apply (select charindex(',X=', T.COMMASEPA)+3 as X,
                      charindex(',Y=', T.COMMASEPA)+3 as Y,
                      charindex(',Z=', T.COMMASEPA)+3 as Z,
                      charindex(',A=', T.COMMASEPA)+3 as A,
                      charindex(',B=', T.COMMASEPA)+3 as B,
                      charindex(',C=', T.COMMASEPA)+3 as C,
                      charindex(',D=', T.COMMASEPA)+3 as D) as P
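
The > 3 tests compare positions, not the stored values. A small sketch with an illustrative variable @t (not part of the original answer) shows why: CHARINDEX returns 0 when the key is missing, so the offset position is exactly 3 in that case and greater than 3 whenever the key is found.

```sql
-- @t mimics T.COMMASEPA: the original string wrapped in commas.
DECLARE @t VARCHAR(300) = ',' + 'X=10,Y=2' + ',';

SELECT
    CHARINDEX(',X=', @t) + 3 AS pos_X,  -- 4: the value of X starts at position 4
    CHARINDEX(',A=', @t) + 3 AS pos_A,  -- 3: CHARINDEX returned 0 (key not found)
    SUBSTRING(@t, 4, CHARINDEX(',', @t, 4) - 4) AS value_X;  -- '10'
```

The trailing comma added to the string guarantees that the inner CHARINDEX always finds an end position, even for the last pair.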

1 Comment

Thanks. It solves the problem for the table I quoted as an example, as all the numbers are greater than 3, but I think Jake's answer is more universal. Thanks for your help.

0

Correct my line of thinking here...

Instead of comma-delimiting a field, it would be more prudent to have a second table to hold your name/value pairs.

Modify SAMP to have the following field:
ID - integer - Primary Key Auto increment

Create a table NVP
ID - integer - Primary Key Auto increment
SAMPID - integer Foreign key SAMP.ID
Name - varchar(255) - or any realistic size
Value - varchar(255) - or any realistic size

This will allow for the following:
1.  Unlimited fields
2.  Faster data access
3.  Since you are not shoving several values into one field, you no longer have to worry about running out of space.
4.  Less code to maintain for splitting and joining data
5.  No more restriction on storing a "," in a name or value.

SQL tables should always be relational to take advantage of the power SQL has to offer.
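
The layout described above could be sketched roughly as follows. The column names and sizes follow the description; the sample rows, the assumption that the first SAMP row receives ID 1, and the pivot query are illustrative additions, not a definitive design.

```sql
-- Add a surrogate key to the existing table.
ALTER TABLE SAMP ADD ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY;
GO

-- One row per name/value pair, linked back to SAMP.
CREATE TABLE NVP
(
    ID      INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SAMPID  INT NOT NULL REFERENCES SAMP(ID),
    [Name]  VARCHAR(255) NOT NULL,
    [Value] VARCHAR(255) NULL
);
GO

-- Replaces 'X=1,Y=2,Z=3' in a single column.
-- SAMPID = 1 assumes the first SAMP row was assigned ID 1.
INSERT INTO NVP (SAMPID, [Name], [Value])
VALUES (1, 'X', '1'), (1, 'Y', '2'), (1, 'Z', '3');

-- Pivot back to columns when the wide shape is needed.
SELECT s.ID,
       MAX(CASE WHEN n.[Name] = 'X' THEN n.[Value] END) AS X,
       MAX(CASE WHEN n.[Name] = 'Y' THEN n.[Value] END) AS Y,
       MAX(CASE WHEN n.[Name] = 'Z' THEN n.[Value] END) AS Z
FROM SAMP AS s
LEFT JOIN NVP AS n ON n.SAMPID = s.ID
GROUP BY s.ID;
```

With this shape, adding a new key such as E needs no schema change, only new rows in NVP.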
