Split a delimited string into a table with multiple rows and columns

Question

Please could you assist? I am new to SQL and am faced with the scenario below. I have searched Google for a solution but have not found one.

I have a temporary table named #TEMP with a single column named results; the number of rows depends on how long the CSV string is. SELECT * FROM #TEMP returns data like the following:

results

88.47,1,263759,10.00|303.53,2,264051,13.00|147.92,3,264052,6.00|43.26,4,268394,10.00| 127.7,5,269229,4.00|

The link below shows what the results look like directly from the database:
http://design.northdurban.com/DatabaseResult.png

I need a solution that reads this data from the existing temporary table and inserts it into another temporary table, split into rows and columns. The required output is shown in the link below:

http://design.northdurban.com/capture.png

Please could you help? I am sure this post will assist many other users, as I have not found any existing solution.

What version of SQL Server are you using? You can use a window function like row_number() to generate the ID column rows.
– chridam, Commented Jan 23, 2015 at 11:54
In the temp table, do you have one column with merged data per row, like 88.47,1,263759,10.00? Or do you have one column and one row containing one big merged string, like 88.47,1,263759,10.00| 303.53,2,264051,13.00|.......?
– Giorgi Nakeuri, Commented Jan 23, 2015 at 11:56
I have one column in the temp table with 88.47,1,263759,10.00 and have multiple rows depending on how long the string is.
– Zack, Commented Jan 23, 2015 at 12:03
You want to insert data from the #TEMP1 table into the #TEMP2 table. Is that what you want to do?
– Pranav Bilurkar, Commented Jan 23, 2015 at 12:07

Pரதீப் · Accepted Answer · 2015-01-26 12:56:56Z

1

First, convert the string to rows using the delimiter |:

DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'

SELECT Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)')))
FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
       CROSS APPLY Data.nodes ('/M') AS Split(a)
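
To see why this works, it can help to look at the intermediate string before it is cast to XML. The following is a minimal sketch using a shortened sample value of @str; the xml_text alias is only there for illustration.

```sql
DECLARE @str VARCHAR(MAX) = '88.47,1,263759,10.00| 303.53,2,264051,13.00|';

-- Every | becomes a close-tag/open-tag pair, so each record gets its own <M> element.
SELECT '<M>' + REPLACE(@str, '|', '</M><M>') + '</M>' AS xml_text;
-- <M>88.47,1,263759,10.00</M><M> 303.53,2,264051,13.00</M><M></M>
```

The trailing | produces an empty <M></M> element at the end, which is why the queries in this answer filter with IS NOT NULL. The cast to XML also fails if the data contains characters such as & or <, which is not a concern for purely numeric data like this.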

Then split each row into separate columns using the PARSENAME trick:

SELECT Id,c1,c2,c3
FROM  (SELECT Id=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
              C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
              c2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
              c3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
       FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
              CROSS APPLY Data.nodes ('/M') AS Split(a)) a
WHERE  id IS NOT NULL
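
The nested REPLACE and PARSENAME calls are easier to follow on a single record. Here is a small sketch of the steps, assuming one already-trimmed record with exactly four fields; the variable names @row and @dotted are made up for this illustration.

```sql
DECLARE @row VARCHAR(100) = '88.47,1,263759,10.00';

-- Step 1: hide the decimal points    -> 88;47,1,263759,10;00
-- Step 2: turn the commas into dots  -> 88;47.1.263759.10;00
DECLARE @dotted VARCHAR(100) = REPLACE(REPLACE(@row, '.', ';'), ',', '.');

-- Step 3: PARSENAME numbers the parts from the right (1 = last, 4 = first).
-- Step 4: restore the decimal points in each extracted part.
SELECT REPLACE(PARSENAME(@dotted, 4), ';', '.') AS c1,  -- 88.47
       REPLACE(PARSENAME(@dotted, 3), ';', '.') AS c2,  -- 1
       REPLACE(PARSENAME(@dotted, 2), ';', '.') AS c3,  -- 263759
       REPLACE(PARSENAME(@dotted, 1), ';', '.') AS c4;  -- 10.00
```

The placeholder character (; here) only has to be something that never occurs in the data.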


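Update: For better performance, try this. It reads the node value once in a derived table instead of repeating the extraction and trimming for every column.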

SELECT c1,c2,c3,c4
FROM   (SELECT C1=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 4), ';', '.'),
               C2=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 3), ';', '.'),
               C3=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 2), ';', '.'),
               C4=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') col
                FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                       CROSS APPLY Data.nodes ('/M') AS Split(a))v) a
WHERE  c1 IS NOT NULL;

Update 2: To parse more than one row from a table, use this code.

Sample table with data

create table #test(string varchar(8000))
insert into #test values
('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'),
('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|')

Query

SELECT c1,c2,c3,c4
FROM   (SELECT C1=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 4), ';', '.'),
               C2=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 3), ';', '.'),
               C3=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 2), ';', '.'),
               C4=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') col
                FROM   (SELECT Cast ('<M>' + Replace(string, '|', '</M><M>') + '</M>' AS XML)
                 AS Data from #test) AS A
                       CROSS APPLY Data.nodes ('/M') AS Split(a))v) a
WHERE  c1 IS NOT NULL;
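
The question asks for the result to end up in a second temporary table. A minimal sketch of that last step follows, reusing the same query shape. The table names #src and #result, the output column names, and the data types are assumptions made for illustration, not taken from the question.

```sql
-- Hypothetical source table holding one delimited string per row.
CREATE TABLE #src (string VARCHAR(8000));
INSERT INTO #src VALUES ('88.47,1,263759,10.00| 303.53,2,264051,13.00|');

-- SELECT ... INTO creates #result with typed columns.
-- The first field can carry a leading space (after each |), so it is trimmed before the cast.
SELECT CAST(LTRIM(RTRIM(p.c1)) AS DECIMAL(10, 2)) AS Column1,
       CAST(p.c2 AS INT)                          AS Column2,
       CAST(p.c3 AS INT)                          AS Column3,
       CAST(p.c4 AS DECIMAL(10, 2))               AS Column4
INTO   #result
FROM   (SELECT c1 = REPLACE(PARSENAME(REPLACE(REPLACE(col, '.', ';'), ',', '.'), 4), ';', '.'),
               c2 = REPLACE(PARSENAME(REPLACE(REPLACE(col, '.', ';'), ',', '.'), 3), ';', '.'),
               c3 = REPLACE(PARSENAME(REPLACE(REPLACE(col, '.', ';'), ',', '.'), 2), ';', '.'),
               c4 = REPLACE(PARSENAME(REPLACE(REPLACE(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') AS col
                FROM   (SELECT CAST('<M>' + REPLACE(string, '|', '</M><M>') + '</M>' AS XML) AS Data
                        FROM   #src) AS A
                       CROSS APPLY Data.nodes('/M') AS Split(a)) v) p
WHERE  p.c1 IS NOT NULL;

SELECT * FROM #result;

DROP TABLE #src;
DROP TABLE #result;
```

If the target table already exists, INSERT INTO ... SELECT with the same select list would be the equivalent form.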

edited Jan 26, 2015 at 12:56

answered Jan 23, 2015 at 12:13

Pரதீப்


12 Comments

Martin Brown Over a year ago

Interesting. I wonder how quick this would be? It requires scanning over the string a huge number of times.

Pரதீப் Over a year ago

Downvoter, can you please comment so that I can improve my answer?

Pரதீப் Over a year ago

@MartinBrown - After converting it to rows, each row holds only a small piece of data, not the entire string.

Martin Brown Over a year ago

That was me and I did.

Martin Brown Over a year ago

First you are doing a split which scans the string once. Then for each part of the split you are doing a split which scans the string again. Then for each part of that you are doing two replaces and a ParseName which is scanning each part three times. In total you are scanning the whole string five times. Also there are going to be a large number of intermediate strings created which all require memory allocations which if memory is fragmented may be slow. While this won't be an issue if there is only one string, if you have a couple of million strings to process that is going to add up.


Giorgi Nakeuri · Accepted Answer · 2015-01-23 12:21:46Z

0

This will only work if you have 4 columns. In that case you can do the following:

SELECT REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 4), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 3), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 2), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 1), '~', '.')
From #TEMP
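
The four-column limit comes from PARSENAME itself, which was designed for object names of up to four parts. A quick sketch with made-up literal values shows what happens when a record has a fifth field:

```sql
-- Four comma-separated fields: PARSENAME can address each one.
SELECT PARSENAME(REPLACE('a,b,c,d', ',', '.'), 1) AS four_fields;    -- d

-- Five fields: PARSENAME returns NULL because it handles at most four parts.
SELECT PARSENAME(REPLACE('a,b,c,d,e', ',', '.'), 1) AS five_fields;  -- NULL
```

For records with more than four fields, a different splitting approach is needed.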

answered Jan 23, 2015 at 12:21

Giorgi Nakeuri


3 Comments

Pரதீப் Over a year ago

You have just used the ~ symbol instead of the ; from my answer.

Giorgi Nakeuri Over a year ago

Sorry, but you have cross applies, FOR XML and other unneeded stuff. I didn't rewrite or copy-paste your solution. I wrote it by myself.

Pரதீப் Over a year ago

This is just half a solution.

Martin Brown · Accepted Answer · 2015-01-25 22:44:55Z

You can write a table-valued function to parse the string like this:

CREATE FUNCTION dbo.parseData ( @stringToSplit VARCHAR(MAX) )
RETURNS
    @return TABLE (ID int, Column1 real, Column2 int, Column3 int, Column4 real)
AS
BEGIN

    DECLARE @char char;
    DECLARE @len int = LEN(@stringToSplit);    

    DECLARE @buffer varchar(50) = '';

    DECLARE @field int = 1;

    DECLARE @Column1 real
    DECLARE @Column2 int
    DECLARE @Column3 int
    DECLARE @Column4 real

    DECLARE @row int = 1

    DECLARE @i int = 1;
    WHILE @i <= @len BEGIN

        SELECT @char = SUBSTRING(@stringToSplit, @i, 1)

        IF @char = ','
        BEGIN
            IF @field = 1
                SET @Column1 = CONVERT(real, @buffer);
            ELSE IF @field = 2
                SET @Column2 = CONVERT(int, @buffer);
            ELSE IF @field = 3
                SET @Column3 = CONVERT(int, @buffer);    
            SET @buffer = '';
            SET @field = @field + 1
        END
        ELSE IF @char = '|'
        BEGIN
            SET @Column4 = CONVERT(real, @buffer);
            INSERT INTO @return (ID, Column1, Column2, Column3, Column4)
            VALUES (@row, @Column1, @Column2, @Column3, @Column4);
            SET @buffer = '';
            SET @row = @row + 1
            SET @field = 1
        END
        ELSE
        BEGIN
            SET @buffer = @buffer + @char
        END

        SET @i = @i + 1;
    END

    RETURN
END
GO

And then call that function like this:

SELECT Col1 = '88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'
INTO #Temp1;

INSERT INTO #Temp1
VALUES ('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|')

SELECT data.*
INTO #Temp2
FROM #Temp1 CROSS APPLY parseData(#Temp1.Col1) as data

SELECT *
FROM #Temp2

DROP TABLE #Temp1
DROP TABLE #Temp2

Performance:

So I ran a performance test of this technique against the technique described by NoDisplayName. Over 10,000 iterations my technique took 13,826 ms and NoDisplayName's took 36,176 ms, so mine takes only about 38% of the time NoDisplayName's does.

To test this I used an Azure database and ran the following script.

-- First two queries to check the results are the same.
-- Note the Parsename technique returns strings rather than reals which is why
-- the last column has .00 at the end of the numbers in the Parsename technique.
DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.01|'

SELECT c1,c2,c3, c4
    FROM  (SELECT C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
                  C2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
                  C3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
                  C4=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
           FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                  CROSS APPLY Data.nodes ('/M') AS Split(a)) a
    WHERE  c1 IS NOT NULL;

SELECT *
FROM dbo.parseData(@str)
GO

-- Now let's time the Parsename method over 10,000 iterations
SET NOCOUNT ON;

DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'

DECLARE @i int = 0
declare @table table (c1 decimal, c2 int, c3 int, c4 decimal)

DECLARE @Start datetime = GETDATE();

while @i < 1000
begin

    INSERT INTO @table
    SELECT c1,c2,c3, c4
    FROM  (SELECT C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
                  C2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
                  C3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
                  C4=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
           FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                  CROSS APPLY Data.nodes ('/M') AS Split(a)) a
    WHERE  c1 IS NOT NULL;

    DELETE FROM @table;

    set @i = @i + 1;
end

DECLARE @End datetime = GETDATE()
PRINT CONVERT(nvarchar(50),@Start,126) + ' - ' + convert(nvarchar(50),@End,126) + ' - ' + convert(nvarchar(50), DATEDIFF(ms, @start, @end))
GO

-- Now time my technique over 10,000 iterations
SET NOCOUNT ON;

DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'

DECLARE @i int = 0
declare @table table (c1 decimal, c2 int, c3 int, c4 decimal)

DECLARE @Start datetime = GETDATE();

while @i < 1000
begin

    INSERT INTO @table
    SELECT *
    FROM dbo.parseData(@str)
    DELETE FROM @table;

    set @i = @i + 1;
end
DECLARE @End datetime = GETDATE()
PRINT CONVERT(nvarchar(50),@Start,126) + ' - ' + convert(nvarchar(50),@End,126) + ' - ' + convert(nvarchar(50), DATEDIFF(ms, @start, @end))
GO
