1

I have values like these in the PIDs column of my SQL Server table. I would like to split this column into two separate columns based on the ID name. Any value after GID: should go to a GID column, and any value after PACKAGEID: should go to a PACKAGEID column. Could you please help me with this?

PIDs
GID: 2672, PACKAGEID: 91290
PACKAGEID: 130116, GID: 7d78b
GID: 09e541, PACKAGEID: 17105
GID: 14e3ba, PACKAGEID: 80017
PACKAGEID: 730829, GID: a871c
PACKAGEID: 1009409, GID: c8b2

Expected output - either like this:

GID PACKAGEID
2672 91290
7d78b 130116
09e541 17105
14e3ba 80017
a871c 730829
c8b2 1009409

or this:

GID PACKAGEID
GID:2672 PACKAGEID:91290
GID:7d78b PACKAGEID:130116
GID:09e541 PACKAGEID:17105
GID:14e3ba PACKAGEID:80017
GID:a871c PACKAGEID:730829
GID:c8b2 PACKAGEID:1009409
9
  • 3
    SQL Server has scant native support for regex. You should consider extracting the GID and PACKAGEID before bringing this data into SQL Server. Commented Jun 1, 2024 at 4:35
  • 1
    What have you tried? Where did you get stuck? A quick search would have shown that T-SQL doesn't support regex, and also what others have used instead. Commented Jun 1, 2024 at 4:38
  • 1
    If you follow @TimBiegeleisen's advice and do the pre-processing in a language that uses the C# regex engine you could extract values for GID and PACKAGEID in each line with the regex ^(?=.*\bGID: *(?<GID>[\da-z]+\b))(?=.*\bPACKAGEID: *(?<PACKAGEID>\d+\b)) that uses named capture groups. Demo. See the answers here for examples on how to extract values of named capture groups. Commented Jun 1, 2024 at 4:57
  • 1
    There are loads of T-SQL solutions for string splitting available with a quick Google search, if that's the approach you wish to take, e.g. STRING_SPLIT. Commented Jun 1, 2024 at 5:04
  • 1
    Your data almost looks like maybe it originally came from JSON. If you instead import that original JSON, you might be able to use SQL Server's JSON functions to extract the values. Commented Jun 1, 2024 at 5:15
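
To make the pre-processing suggestion in the comments above concrete, here is a minimal sketch in Python rather than C#. It ports the lookahead regex from the comment, with the named groups rewritten in Python's (?P<name>...) syntax. The function name extract_ids and the sample rows are illustrative assumptions, not part of the original thread.

```python
import re

# Port of the lookahead regex from the comment above.
# Python writes named groups as (?P<name>...) instead of (?<name>...).
# Each lookahead scans the whole line, so the order of the two keys does not matter.
PIDS_RE = re.compile(
    r"^(?=.*\bGID: *(?P<GID>[\da-z]+\b))(?=.*\bPACKAGEID: *(?P<PACKAGEID>\d+\b))"
)

def extract_ids(pids):
    """Return (GID, PACKAGEID) for one PIDs string, or None if either key is missing."""
    m = PIDS_RE.search(pids)
    if m is None:
        return None
    return m.group("GID"), m.group("PACKAGEID")

# Sample rows copied from the question.
rows = [
    "GID: 2672, PACKAGEID: 91290",
    "PACKAGEID: 130116, GID: 7d78b",
]
for row in rows:
    print(extract_ids(row))
```

The resulting pairs could then be loaded into two separate columns. The pattern is case-sensitive as written, so upper-case hex digits in a GID would need [\da-zA-Z] or the re.IGNORECASE flag.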

4 Answers

3

One way of doing this is to use some string manipulation to convert the string into an XML element with attributes named GID and PACKAGEID, and then use XML methods on that:

SELECT PIDs,
       GID = X.value('x[1]/@GID', 'VARCHAR(100)'),
       PACKAGEID = X.value('x[1]/@PACKAGEID', 'VARCHAR(100)')
FROM Data
CROSS APPLY (VALUES (CONVERT(XML,
                '<x ' + REPLACE(REPLACE(PIDs, ': ', '="'),',','" ') + '" />'
            ))) V(X)

Or use much the same approach with JSON instead (which gives a simpler execution plan than the XML one):

SELECT PIDs, GID, PACKAGEID
FROM Data
CROSS APPLY 
     OPENJSON('{"' + REPLACE(REPLACE(PIDs, ': ', '":"'), ', ', '","') + '"}')
WITH 
     (GID VARCHAR(100), PACKAGEID VARCHAR(100))

Fiddle
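
The string rewrite that feeds OPENJSON can be checked outside the database. The sketch below mimics the two nested REPLACE calls from the query above in Python and parses the result. The helper names pids_to_json_text and parse_pids are hypothetical, and this only illustrates the text transformation, not how SQL Server executes it.

```python
import json

def pids_to_json_text(pids):
    """Mimic the nested REPLACE calls: ': ' becomes '":"', ', ' becomes '","',
    and the whole string is wrapped in '{"' ... '"}'."""
    return '{"' + pids.replace(': ', '":"').replace(', ', '","') + '"}'

def parse_pids(pids):
    """Parse the rewritten text and pick the two keys by name, in any order."""
    obj = json.loads(pids_to_json_text(pids))
    return obj.get("GID"), obj.get("PACKAGEID")

print(pids_to_json_text("PACKAGEID: 130116, GID: 7d78b"))
print(parse_pids("PACKAGEID: 130116, GID: 7d78b"))
```

Because the rewrite does no escaping, a value containing a double quote or a backslash would produce invalid JSON. The sample data in the question contains neither.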


2

You can accomplish this by using a combination of CHARINDEX(), LEFT(), and either RIGHT() or STUFF() to manually split the strings in two steps. Feed the two strings from the first split through a CROSS APPLY (VALUES ...) into the second split. After the second split, the resulting name/value pairs can be reorganized using conditional aggregation to get your final results.

SELECT *
FROM Data D
CROSS APPLY (
    SELECT
        MAX(CASE WHEN NV.Name = 'GID' THEN NV.Value END) AS GID,
        MAX(CASE WHEN NV.Name = 'PACKAGEID' THEN NV.Value END) AS PACKAGEID
    FROM (SELECT CHARINDEX(',', D.PIDs) AS Pos) P
    CROSS APPLY (
        VALUES
            (LEFT(D.PIDs, P.Pos - 1)),
            (STUFF(D.PIDs, 1, P.Pos, ''))
    ) I(Item)
    CROSS APPLY (SELECT CHARINDEX(':', I.Item) AS Pos2) P2
    CROSS APPLY (
        SELECT
            TRIM(LEFT(I.Item, P2.Pos2 - 1)) AS Name,
            TRIM(STUFF(I.Item, 1, P2.Pos2, '')) AS Value
    ) NV
) A

If using SQL Server 2022 or later, you can use the STRING_SPLIT() function to accomplish the same. (Note that the second STRING_SPLIT() below requires use of the enable_ordinal option, not available prior to 2022.)

SELECT *
FROM Data D
CROSS APPLY (
    SELECT
        MAX(CASE WHEN NV.Name = 'GID' THEN NV.Value END) AS GID,
        MAX(CASE WHEN NV.Name = 'PACKAGEID' THEN NV.Value END) AS PACKAGEID
    FROM STRING_SPLIT(D.PIDs, ',') S
    CROSS APPLY (
        SELECT
            TRIM(MAX(CASE WHEN S2.ordinal = 1 THEN S2.value END)) AS Name,
            TRIM(MAX(CASE WHEN S2.ordinal = 2 THEN S2.value END)) AS Value
        FROM STRING_SPLIT(S.value, ':', 1) S2
    ) NV
) A

For versions older than 2017, you will need to replace the TRIM() function with LTRIM() or an LTRIM(RTRIM()) combo.

Results:

PIDs                           GID     PACKAGEID
GID: 2672, PACKAGEID: 91290    2672    91290
PACKAGEID: 130116, GID: 7d78b  7d78b   130116
GID: 09e541, PACKAGEID: 17105  09e541  17105
GID: 14e3ba, PACKAGEID: 80017  14e3ba  80017
PACKAGEID: 730829, GID: a871c  a871c   730829
PACKAGEID: 1009409, GID: c8b2  c8b2    1009409

See this db<>fiddle for a demo.
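
The two-step split in this answer can also be traced outside the database. The following is a rough Python sketch of the same idea: split on the comma, split each item on the colon, then pick the values by name, which is the role the conditional aggregation plays in the queries above. The function name split_pids is an assumption for illustration.

```python
def split_pids(pids):
    """Two-step split: items on ',', then name/value on the first ':'."""
    pairs = {}
    for item in pids.split(","):               # first split: one "NAME: value" item per comma
        name, _, value = item.partition(":")   # second split: name before the first colon
        pairs[name.strip()] = value.strip()    # trim the spaces left over from ", " and ": "
    # Select by name, so the order of the items in the string does not matter.
    return pairs.get("GID"), pairs.get("PACKAGEID")

print(split_pids("PACKAGEID: 730829, GID: a871c"))  # prints ('a871c', '730829')
```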


1

You can convert the PIDs column to JSON and then extract the values with JSON_VALUE.

See example:

select 
  json_value(jPIDs,'$.GID') GID
 ,json_value(jPIDs,'$.PACKAGEID') PACKAGEID
  ,*
from(
  select *
    ,replace('{"'+replace(replace(pids,':','":"'),',','","')+'"}',' ','') jPIDs
  from test
)a

Results:

GID     PACKAGEID  PIDs                           jPIDs
2672    91290      GID: 2672, PACKAGEID: 91290    {"GID":"2672","PACKAGEID":"91290"}
7d78b   130116     PACKAGEID: 130116, GID: 7d78b  {"PACKAGEID":"130116","GID":"7d78b"}
09e541  17105      GID: 09e541, PACKAGEID: 17105  {"GID":"09e541","PACKAGEID":"17105"}
14e3ba  80017      GID: 14e3ba, PACKAGEID: 80017  {"GID":"14e3ba","PACKAGEID":"80017"}
a871c   730829     PACKAGEID: 730829, GID: a871c  {"PACKAGEID":"730829","GID":"a871c"}
c8b2    1009409    PACKAGEID: 1009409, GID: c8b2  {"PACKAGEID":"1009409","GID":"c8b2"}

Demo


1

This can also be done using the text functions available in SQL Server. You can achieve the first expected output like this:

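-- Split each row at the comma into a left and a right "NAME: value" item.
WITH flo AS (
    SELECT PIDS,
           SUBSTRING(PIDS, 1, CHARINDEX(',', PIDS) - 1) AS left_side,
           SUBSTRING(PIDS, CHARINDEX(',', PIDS) + 2, 255) AS right_side
    FROM Data
), flo1 AS (
    -- 'G' sorts before 'P', so the item with the smaller first letter is the GID item.
    SELECT CASE WHEN LEFT(left_side, 1) < LEFT(right_side, 1) THEN left_side ELSE right_side END AS GID,
           CASE WHEN LEFT(left_side, 1) > LEFT(right_side, 1) THEN left_side ELSE right_side END AS Packageid
    FROM flo
)
-- Keep only the text after ': ' (assumes exactly one space after the colon).
SELECT SUBSTRING(gid, CHARINDEX(':', gid) + 2, 255) AS gid,
       SUBSTRING(packageid, CHARINDEX(':', packageid) + 2, 255) AS packageid
FROM flo1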

And the second expected output like this:

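-- Split each row at the comma into a left and a right "NAME: value" item.
WITH flo AS (
    SELECT PIDS,
           SUBSTRING(PIDS, 1, CHARINDEX(',', PIDS) - 1) AS left_side,
           SUBSTRING(PIDS, CHARINDEX(',', PIDS) + 2, 255) AS right_side
    FROM Data
)
-- 'G' sorts before 'P', so the item with the smaller first letter is the GID item.
SELECT CASE WHEN LEFT(left_side, 1) < LEFT(right_side, 1) THEN left_side ELSE right_side END AS GID,
       CASE WHEN LEFT(left_side, 1) > LEFT(right_side, 1) THEN left_side ELSE right_side END AS Packageid
FROM flo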

Hope it helps.

2 Comments

You might want to trim your computed data, as there might be some unwanted spaces here and there, see this demo
Formatting would make your queries much easier to read.
