Replace value in column based on another column

Question

I have the following table:

+----+--------+------------+----------------------+
| ID |  Name  | To_Replace |       Replaced       |
+----+--------+------------+----------------------+
|  1 | Fruits | 1          | Fruits               |
|  2 | Apple  | 1-2        | Fruits-Apple         |
|  3 | Citrus | 1-3        | Fruits-Citrus        |
|  4 | Orange | 1-3-4      | Fruits-Citrus-Orange |
|  5 | Empire | 1-2-5      | Fruits-Apple-Empire  |
|  6 | Fuji   | 1-2-6      | Fruits-Apple-Fuji    |
+----+--------+------------+----------------------+

How can I create the column Replaced? I thought of creating up to 10 columns (I know there are no more than 10 nested levels), fetching the ID from every substring split by '-', and then concatenating the non-null results into Replaced, but I think there is a simpler solution.

First thought is creating a function to pass the To_Replace values and return Replaced.
– SS_DBA, Commented Jul 17, 2020 at 19:01

GMB · Accepted Answer · 2020-07-18 00:33:44Z

3

While what you ask for is technically feasible (probably using a recursive query or a tally), I will take a different stance and suggest that you fix your data model instead.

You should not be storing multiple values as a delimited list in a single database column. This defeats the purpose of a relational database, and makes simple things both unnecessarily complicated and inefficient.

Instead, you should have a separate table to store that data, with each replacement id on a separate row, and possibly a column that indicates the sequence of each element in the list.

For your sample data, this would look like:

id    replace_id    seq
1     1             1
2     1             1
2     2             2
3     1             1
3     3             2
4     1             1
4     3             2
4     4             3
5     1             1
5     2             2
5     5             3
6     1             1
6     2             2
6     6             3
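
As a rough sketch, the mapping table above could be declared and filled as follows. The name mymapping comes from this answer; the column types and the primary key on (id, seq) are assumptions made for illustration, not something stated in the question.

```sql
-- Hypothetical DDL: column types and the key are assumptions.
create table mymapping (
    id         int not null,  -- row of mytable that owns the list
    replace_id int not null,  -- id of one element of the list
    seq        int not null,  -- 1-based position of that element in the list
    primary key (id, seq)
);

-- Same rows as the sample data shown above.
insert into mymapping (id, replace_id, seq) values
    (1, 1, 1),
    (2, 1, 1), (2, 2, 2),
    (3, 1, 1), (3, 3, 2),
    (4, 1, 1), (4, 3, 2), (4, 4, 3),
    (5, 1, 1), (5, 2, 2), (5, 5, 3),
    (6, 1, 1), (6, 2, 2), (6, 6, 3);
```

Keying on (id, seq) keeps each position in a list unique, which is what makes the ordering reliable later on.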

Now you can efficiently generate the expected result with either a join, a subquery, or a lateral join. Assuming that your table is called mytable and that the mapping table is mymapping, the lateral join solution would be:

select t.*, r.*
from mytable t
outer apply (
    -- concatenate the names in list order, separated by '-'
    select string_agg(t1.name, '-') within group(order by m.seq) replaced
    from mymapping m
    inner join mytable t1 on t1.id = m.replace_id
    where m.id = t.id
) r
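
The join and subquery alternatives mentioned above might look like the sketch below. It assumes the same mytable and mymapping tables, '-' as the separator, and SQL Server 2017 or later (string_agg is not available before that).

```sql
-- Variant 1: correlated subquery in the select list.
select
    t.*,
    (
        select string_agg(t1.name, '-') within group (order by m.seq)
        from mymapping m
        inner join mytable t1 on t1.id = m.replace_id
        where m.id = t.id
    ) as replaced
from mytable t;

-- Variant 2: plain joins with aggregation.
-- Left joins keep rows of mytable that have no mapping rows (replaced is null for them).
select
    t.id,
    t.name,
    string_agg(t1.name, '-') within group (order by m.seq) as replaced
from mytable t
left join mymapping m on m.id = t.id
left join mytable t1 on t1.id = m.replace_id
group by t.id, t.name;
```

The second variant has to list every non-aggregated column in the group by clause, which is one reason the apply or subquery forms can be more convenient when mytable has many columns.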

answered Jul 18, 2020 at 0:33

GMB

2 Comments

RegularNormalDayGuy Over a year ago

I might have cut off some important information about my issue, but it is related to this. This table represents data stored hierarchically: each row is stored relative to a parent, and we can have infinite levels within the same table. The issue is that I'm having a hard time updating my FullPath, and this question was one idea I thought was worth investigating.

critical_error Over a year ago

@RegularNormalDayGuy you could consider making your "To_Replace" column XML (or just storing the values in an XML format, or JSON for that matter, depending on your SQL version) as another way to simplify things and avoid some of the additional XML work needed in my example.

critical_error · Accepted Answer · 2020-07-18 18:42:03Z

1

You can try something like this:

DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
( 1, 'Fruits', '1' ),
( 2, 'Apple', '1-2' ),
( 3, 'Citrus', '1-3' ),
( 4, 'Orange', '1-3-4' ),
( 5, 'Empire', '1-2-5' ),
( 6, 'Fuji', '1-2-6' );

SELECT
    *
FROM @Data AS d
OUTER APPLY (

    SELECT STRING_AGG ( [Name], '-' ) AS Replaced FROM @Data WHERE ID IN (
        SELECT CAST ( [value] AS INT ) FROM STRING_SPLIT ( d.To_Replace, '-' )
    )

) List
ORDER BY ID;

Returns

+----+--------+------------+----------------------+
| ID |  Name  | To_Replace |       Replaced       |
+----+--------+------------+----------------------+
|  1 | Fruits | 1          | Fruits               |
|  2 | Apple  | 1-2        | Fruits-Apple         |
|  3 | Citrus | 1-3        | Fruits-Citrus        |
|  4 | Orange | 1-3-4      | Fruits-Citrus-Orange |
|  5 | Empire | 1-2-5      | Fruits-Apple-Empire  |
|  6 | Fuji   | 1-2-6      | Fruits-Apple-Fuji    |
+----+--------+------------+----------------------+

UPDATE

Ensure the id list order is maintained when aggregating names.

DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
    ( 1, 'Fruits', '1' ),
    ( 2, 'Apple', '1-2' ),
    ( 3, 'Citrus', '1-3' ),
    ( 4, 'Orange', '1-3-4' ),
    ( 5, 'Empire', '1-2-5' ),
    ( 6, 'Fuji', '1-2-6' ),
    ( 7, 'Test', '6-2-7' );

SELECT
    *
FROM @Data AS d
OUTER APPLY (

    -- WITHIN GROUP makes the aggregation order explicit.
    SELECT STRING_AGG ( Names.[Name], '-' ) WITHIN GROUP ( ORDER BY ids.row_id ) AS Replaced
    FROM ( SELECT CAST ( '<ids><id>' + REPLACE ( d.To_Replace, '-', '</id><id>' ) + '</id></ids>' AS XML ) AS id_list ) AS xIds
    CROSS APPLY (
        SELECT
            x.f.value('.', 'INT' ) AS name_id,
            -- Numbers the <id> nodes as they are shredded; assumes nodes() returns them in document order.
            ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) AS row_id
        FROM xIds.id_list.nodes('//ids/id') x(f)
    ) AS ids
    INNER JOIN @Data AS Names ON Names.ID = ids.name_id

) List
ORDER BY ID;

Returns

+----+--------+------------+----------------------+
| ID |  Name  | To_Replace |       Replaced       |
+----+--------+------------+----------------------+
|  1 | Fruits | 1          | Fruits               |
|  2 | Apple  | 1-2        | Fruits-Apple         |
|  3 | Citrus | 1-3        | Fruits-Citrus        |
|  4 | Orange | 1-3-4      | Fruits-Citrus-Orange |
|  5 | Empire | 1-2-5      | Fruits-Apple-Empire  |
|  6 | Fuji   | 1-2-6      | Fruits-Apple-Fuji    |
|  7 | Test   | 6-2-7      | Fuji-Apple-Test      |
+----+--------+------------+----------------------+

I'm sure there's optimization that can be done here, but this solution seems to guarantee the list order is kept.
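
One possible simplification, sketched here under the assumption that SQL Server 2017 or later is available with database compatibility level 130 or higher, is to turn the list into a JSON array and read it with OPENJSON. For arrays, OPENJSON returns each element's zero-based index in its [key] column, so the position can be used directly for ordering. The aliases j and n are illustrative.

```sql
DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
    ( 1, 'Fruits', '1' ),
    ( 2, 'Apple', '1-2' ),
    ( 3, 'Citrus', '1-3' ),
    ( 4, 'Orange', '1-3-4' ),
    ( 5, 'Empire', '1-2-5' ),
    ( 6, 'Fuji', '1-2-6' ),
    ( 7, 'Test', '6-2-7' );

SELECT
    d.*,
    List.Replaced
FROM @Data AS d
OUTER APPLY (
    -- '6-2-7' becomes '[6,2,7]'; [key] holds the array index (0, 1, 2), [value] the id.
    SELECT STRING_AGG ( n.[Name], '-' ) WITHIN GROUP ( ORDER BY CAST ( j.[key] AS INT ) ) AS Replaced
    FROM OPENJSON ( '[' + REPLACE ( d.To_Replace, '-', ',' ) + ']' ) AS j
    INNER JOIN @Data AS n ON n.ID = CAST ( j.[value] AS INT )
) AS List
ORDER BY d.ID;
```

This only works while To_Replace contains nothing but integers and '-' separators, since the rewritten string must be valid JSON. On SQL Server 2022 and Azure SQL Database, STRING_SPLIT also accepts a third enable_ordinal argument that adds an ordinal column, which would serve the same purpose.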

edited Jul 18, 2020 at 18:42

answered Jul 17, 2020 at 19:17

critical_error

4 Comments

GMB Over a year ago

Unfortunately, with string_split() there is no guarantee that the order of names in the new column will match the order of the ids in to_replace.

critical_error Over a year ago

I've tried every way I can and it always seems to return in the specified order, but yes, it's not guaranteed. I'll test an alternative when I have a moment.

sacse Over a year ago

The default ordering is ascending, so it will work as long as the numbers in To_Replace are sorted.

critical_error Over a year ago

@GMB - I've added a possible solution for maintaining your list order.
