3

I have a column on a SQL Server table that has a lengthy value with line breaks. I am trying to find the values before each line break.

EXAMPLE:

--Column name: ItemDescription

Case Qty: 12
Weight: 8 oz.
Flavor code: PB
Size: STOCK
Cut: 1/8" x 1/8" x 3/16"
Additions: Bells
Cover Brine #: P1
Kosher Cert: OU
Organic Cert: 

This is EXACTLY what the I get when I copy the cell on my results and paste it. So I converted this field to VARBINARY and saw what ASCII codes are in here. Here is a part of ASCII interpretation of a value:

43 61 73 65 20 51 74 79 3A 20 31 32 0D0A 57 65 69 67 68 74 3A 20 38 20 6F 7A 2E 0D0A 46

0D0A meaning Carriage return and Line feed.

PREFERRED OUTCOME:

Now that the data is clear, I am trying to find the value after the colon and before the line break and put it in a new column.

So here is what my preferred outcome should look like:

enter image description here

WHAT I'VE TRIED:

Here is my current SQL Query to do this:

DECLARE @firstLine int
DECLARE @secondLine int
DECLARE @thirdLine int
DECLARE @fourthLine int
DECLARE @fifthLine int
DECLARE @sixthLine int
DECLARE @seventhLine int
DECLARE @eighthLine int


DECLARE @firstColon int
DECLARE @secondColon int
DECLARE @thirdColon int
DECLARE @fourthColon int
DECLARE @fifthColon int
DECLARE @sixthColon int
DECLARE @seventhColon int
DECLARE @eighthColon int
DECLARE @ninethColon int
DECLARE @itemDesc varchar(MAX)

SELECT 
    @itemDesc = ItemDescription
    ,@firstLine = CHARINDEX(CHAR(13), ItemDescription, 1)
    ,@secondLine = CHARINDEX(CHAR(13), ItemDescription, @firstLine + 1)
    ,@thirdLine =  CHARINDEX(CHAR(13), ItemDescription, @secondLine + 1)
    ,@fourthLine =  CHARINDEX(CHAR(13), ItemDescription, @thirdLine + 1)
    ,@fifthLine =  CHARINDEX(CHAR(13), ItemDescription, @fourthLine+ 1)
    ,@sixthLine =  CHARINDEX(CHAR(13), ItemDescription, @fifthLine + 1)
    ,@seventhLine =  CHARINDEX(CHAR(13), ItemDescription, @sixthLine + 1)
    ,@eighthLine = CHARINDEX(CHAR(13), ItemDescription, @seventhLine + 1)

    ,@firstColon = CHARINDEX(CHAR(58), ItemDescription, 1)--aaa
    ,@secondColon = CHARINDEX(CHAR(58), ItemDescription, @firstLine + 1)
    ,@thirdColon = CHARINDEX(CHAR(58), ItemDescription, @secondLine + 1)
    ,@fourthColon = CHARINDEX(CHAR(58), ItemDescription, @thirdLine + 1)
    ,@fifthColon = CHARINDEX(CHAR(58), ItemDescription, @fourthLine + 1)
    ,@sixthColon = CHARINDEX(CHAR(58), ItemDescription, @fifthLine + 1)
    ,@seventhColon = CHARINDEX(CHAR(58), ItemDescription, @sixthLine + 1)
    ,@eighthColon = CHARINDEX(CHAR(58), ItemDescription, @seventhLine + 1)
    ,@ninethColon = CHARINDEX(CHAR(58), ItemDescription, @eighthLine + 1)
FROM TableName

SELECT 
    ItemDescription
    ,CONVERT(VarBInary, itemDescription)
    ,LTRIM(SUBSTRING(ItemDescription, @firstColon + 2, @firstLine - (@firstColon - 1))) as caseQty          --1
    ,LTRIM(SUBSTRING(ItemDescription, @secondColon + 2, @secondLine - (@secondColon - 1))) as caseQty           --2
    ,LTRIM(SUBSTRING(ItemDescription, @thirdColon + 2, @thirdLine - (@thirdColon - 1))) as FlavorCode
    ,LTRIM(SUBSTRING(ItemDescription, @fourthColon + 2, @fourthLine - (@fourthColon - 1))) as Size
    ,LTRIM(SUBSTRING(ItemDescription, @fifthColon + 2, @fifthLine - (@fifthColon - 1))) as Cut
    ,LTRIM(SUBSTRING(ItemDescription, @sixthColon + 2, @sixthLine - (@sixthColon - 1))) as Additions
    ,LTRIM(SUBSTRING(ItemDescription, @eighthColon + 2, @eighthLine- (@eighthColon - 1))) as Brine
FROM
    TableName

THE ISSUE:

For some reason, the SUBSTRING isn't getting the right substring! I get the Qty and Weight correctly. But if size is RELISHSTOCK, I get RELISHSTOC. For FlavorCode, I get ut: (substring of "Cut:"). For Cut, I get 8" x 3/. For Additions, I get Brin (Substring of "Cover Brine").

WHAT am I doing wrong!! I've tried multiple different combinations of the substring length. I know for a fact it is the length. But now that I realize all the starting positions after flavor code is screwed up, I am not too sure where the mistake is.

Any help would be greatly appreciated.

Thank you.

8
  • 1
    T-SQL isn't meant for string manipulation, much less parsing. If you want to store complex data store them as XML or Json. It's a lot easier to parse data with a language like C# when loading/inserting it Commented Dec 15, 2017 at 14:24
  • @PanagiotisKanavos See, that is the thing. The data in this table is simple enough to be able to get out with just T-SQL. I am horrible at XML and don't even want to open that can of worms for myself. But this data isn't going anywhere. It is just going to a SSRS report. So I thought I can make it work like this. Commented Dec 15, 2017 at 14:28
  • Besides, if you already know that the data will have specific fields, why not create a table for it in the first place? Parsing strings with a regex in C# is very easy. Extract the key/value pairs and insert them as field values to the table Commented Dec 15, 2017 at 14:29
  • 3
    The problem is your data violates 1NF by storing multiple pieces of data in a single tuple. Yours goes even further by storing multiple rows of multiple data points in a single tuple. You have an EAV type pattern stored in a single row. This is a nightmare to deal with sql. Commented Dec 15, 2017 at 14:37
  • 1
    One possible fix (not a solution, the real solution is to change the way to load data and insert it directly into the ) is to convert it to JSON, then use SQL Server's JSON functions to extract attributes. For example, replacing each newline with ',', each : with ' : ' and wrapping everything with '{' and '} would make it look like JSON. You could then use JSON_QUERY to read the properties. This will be slow though. Fix the data Commented Dec 15, 2017 at 14:38

1 Answer 1

2

After about a week of struggling with this problem, and a week of trying to move my execution idea to the suggestions mentioned in the comments, I ended up finding a solution to this problem.

Step 1:
I found out that it was easier to manipulate a special character like ! than it was for line break and line feed (Char(10) and Char(13)).
So the first thing I did was to replace all the line break characters to !. Like so:

REPLACE(REPLACE(columnName, char(10), ''), char(13), '!')

This makes sure that the ! always has a space before it, making it easier to deal with.

Step 2:
I found an answer on SO that had a table-value function to split a string. The function name is fnSplitString and it takes two parameters: @string nvarchar(MAX) and @delimited char(1).

ALTER FUNCTION [dbo].[fnSplitString] 
( 
    @string NVARCHAR(MAX), 
    @delimiter CHAR(1) 
) 
RETURNS @output TABLE(columnName NVARCHAR(MAX), outValue NVARCHAR(MAX)) 
BEGIN 
    DECLARE 
        @start INT
        ,@end INT
        ,@colon INT

    SELECT 
        @start = 1
        ,@colon = CHARINDEX(@delimiter, @string)
        ,@end = CHARINDEX(CHAR(33), @string)
    WHILE @start < LEN(@string) + 1 BEGIN 
        IF(SUBSTRING(@string, @colon + 1, 1) = ' ')
        BEGIN
            INSERT INTO 
                @output (columnName, outValue)  
            VALUES
                ((SUBSTRING(@string, @start, @colon - @start)) ,(SUBSTRING(@string, @colon + 2,  @end - @colon - 2)))
            SET @start = @end + 1 
            SET @colon = CHARINDEX(@delimiter, @string, @start)
            SET @end = CHARINDEX(CHAR(33), @string, @start)
        END
        ELSE
        BEGIN
            INSERT INTO 
                @output (columnName, outValue)  
            VALUES
                ((SUBSTRING(@string, @start, @colon - @start)) ,(SUBSTRING(@string, @colon + 1,  @end - @colon - 1)))
            SET @start = @end + 1 
            SET @colon = CHARINDEX(@delimiter, @string, @start)
            SET @end = CHARINDEX(CHAR(33), @string, @start)
        END
    END 
    RETURN 
END

I had to change a lot to the function because this function was originally meant for getting values AFTER a delimiter up to the end of the string, which is pretty straightforward. But mine had to go from the delimiter to the !, which is char(33).

Step 3:
I wrote a stored procedure that handles all these operations. I already had a pretty lengthy stored procedure in place for my requirement and I had to add the results of this function joined with the results of a select statement to a new CTE and use that in the final result set.

For anyone that is curious, found in another SO question that the join of a table and a function can be done like so:

SELECT
    table1.columnName1
    ,function1.columnName2
    ,function1.columnName3
FROM
    tableName1 table1
CROSS APPLY
    dbo.functionName1(table1.columnName2, ':') function1

That did it.

I hope this helps someone in the future.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.