Problem

My MSSQL database has a table records with an XML column data, which is used like this:

<record id="1">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-GB">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>

I want to update all rows in the table at once, changing the value of /record/field/@lang to en-US wherever it is currently en-GB (for every element with that attribute value).

Already tried something like...

declare @i int;
declare @xml xml;
set @xml = (select top(1) [data] from [my-database].[dbo].[records]);
select @i = @xml.value('count(/record/field[lang="en-GB"])', 'int')

while @i > 0
begin
    set @xml.modify('
            replace value of
                (/record/field[lang="en-GB"]/text())[1]
            with "en-US"
    ')

    set @i = @i - 1
end

select @xml;

... but it returns the data unchanged and only works if a single row is selected. How can I make this work and update all rows in one go?

Solution

I ended up using XQuery as suggested by Shnugo. My slightly generalized query looks like this:

UPDATE [my-database].[dbo].[records] SET data = data.query(N'
    <record>
    {
        for $attr in /record/@*
        return $attr
    }
    {
        for $fld in /record/*
        return
        if (local-name($fld) = "field")
        then <field>
        {
            for $attr in $fld/@*
            return
            if (local-name($attr) = "lang" and $attr = "en-GB")
            then attribute lang {"en-US"}
            else $attr
        }
        {$fld/node()}
        </field>
        else $fld
    }
    </record>
')
FROM [my-database].[dbo].[records]
WHERE [data].exist('/record/field[@lang="en-GB"]') = 1;
SELECT * FROM [my-database].[dbo].[records]

The name of the topmost node <record> apparently has to be hard-coded, because the XQuery implementation in MSSQL Server doesn't support dynamic element names (nor dynamic attribute names). With the code above, the attributes of <record> and all of its child elements other than <field> are copied unchanged.
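
To check that claim without touching the real table, the same query can be run against a table variable first. This is only a sketch: @tbl, the source attribute and the <note> element are made up for illustration and are not part of my real data.

```sql
-- Hypothetical test data: an extra attribute on <record> and a non-<field> child
DECLARE @tbl TABLE (ID int IDENTITY, data xml);
INSERT INTO @tbl (data) VALUES
(N'<record id="1" source="import">
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <note>keep me</note>
  <field tag="WA">173</field>
</record>');

-- Same XQuery as in the solution above, applied to the table variable
UPDATE @tbl SET data = data.query(N'
    <record>
    {
        for $attr in /record/@*
        return $attr
    }
    {
        for $fld in /record/*
        return
        if (local-name($fld) = "field")
        then <field>
        {
            for $attr in $fld/@*
            return
            if (local-name($attr) = "lang" and $attr = "en-GB")
            then attribute lang {"en-US"}
            else $attr
        }
        {$fld/node()}
        </field>
        else $fld
    }
    </record>
')
WHERE data.exist('/record/field[@lang="en-GB"]') = 1;

-- Inspect the rewritten XML
SELECT ID, data FROM @tbl;
```

The result should still contain both attributes of <record> and the <note> element in its original position, with only the lang attribute changed.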

    Great question, great solution 😀 Commented Apr 25, 2017 at 13:35

4 Answers

An ugly solution without XQuery or XPath, using a plain string replace:

DECLARE @xml XML = N'<record id="1">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-GB">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>'

SET @xml = REPLACE(CAST(@xml AS nvarchar(max)), '"en-GB"', '"en-US"')

SELECT @xml

Alternatively, use modify() in a loop, changing one attribute per iteration:

DECLARE @nodeCount int
DECLARE @i int

SET @i = 1

-- Count only the lang attributes that still need to be changed
SELECT @nodeCount = @xml.value('count(/record/field/@lang[.="en-GB"])','int')

WHILE (@i <= @nodeCount)
BEGIN
    Set @xml.modify('replace value of (/record/field/@lang)[.="en-GB"][1] with "en-US"')
    SET @i = @i + 1
END

SELECT @xml

Demo link: Rextester
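
To apply the modify() approach to every row of a table, one option is to wrap a set-based UPDATE in a loop that runs until no row contains the old value any more. The following is a minimal sketch, not tested against your data: the table variable @tbl and its column YourXml are placeholders for your own table and XML column.

```sql
-- Hypothetical sample table; replace with your own table and column
DECLARE @tbl TABLE (ID int IDENTITY, YourXml xml);
INSERT INTO @tbl (YourXml) VALUES
(N'<record id="1">
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <field tag="DI" occ="2" lang="en-GB">width</field>
</record>'),
(N'<record id="2">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-GB">height</field>
</record>');

-- Each pass changes the first remaining en-GB attribute in every affected row;
-- the loop ends once no row contains such an attribute.
WHILE EXISTS (SELECT 1 FROM @tbl
              WHERE YourXml.exist('/record/field[@lang="en-GB"]') = 1)
BEGIN
    UPDATE @tbl
    SET YourXml.modify('replace value of (/record/field/@lang[.="en-GB"])[1] with "en-US"')
    WHERE YourXml.exist('/record/field[@lang="en-GB"]') = 1;
END;

SELECT ID, YourXml FROM @tbl;
```

The number of passes equals the largest number of en-GB attributes found in any single row, so this is still a loop, just not a loop per row.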

3 Comments

Can you expand your answer on how to use the second example on multiple records in an existing table? (You showed it only for a single record that you defined in-line.)
This actually worked for me. Very simple solution... Thanks!
How is this ugly? Compared to the accepted answer it is far simpler, more easily understood for most developers and shorter. You are still using XPath by the way.

I'm adding this as a second answer, as it follows a completely different approach. The following code uses .query() with a FLWOR expression to read the XML as-is but change the attribute lang wherever its value is en-GB:

DECLARE @xml XML=
N'<record id="1">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-GB">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>';

The query

SELECT @xml.query
(N'
    <record id="{/record/@id}">
    {
        for $fld in /record/field
        return <field>
        {
            for $attr in $fld/@*
            return
            if(local-name($attr)="lang" and $attr="en-GB") then attribute lang {"en-US"}
            else $attr
        }
        {$fld/text()}
        </field>
    }
    </record>
')

The result

<record id="1">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-US">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-US">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>

UPDATE: The same query can also update all rows of a table at once:

DECLARE @tbl TABLE(ID INT IDENTITY,YourXml XML)
INSERT INTO @tbl VALUES
(
N'<record id="1">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-GB">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>'
)
,(
N'<record id="2">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-GB">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>'
);

UPDATE @tbl SET YourXml=YourXml.query
(N'
    <record id="{/record/@id}">
    {
        for $fld in /record/field
        return <field>
        {
            for $attr in $fld/@*
            return
            if(local-name($attr)="lang" and $attr="en-GB") then attribute lang {"en-US"}
            else $attr
        }
        {$fld/text()}
        </field>
    }
    </record>
');

SELECT * FROM @tbl

6 Comments

Definitely the cleanest solution. Much appreciated! I tried to generalize the code as much as possible and appended my final solution to my question. Let me know if there is anything I could improve, such as not hardcoding <record> as top most element.
@CoDEmanX I'm glad to read this! Your own solution looks great 😁 Just keep in mind, that this might rearrange your XML.
It does change the order, but XML does not guarantee it anyway. Each repeatable field has an occurrence number in my data to retain its relative position within its group. The order of <field> and adjacent elements does not matter to me. Thanks again!
@CoDEmanX Well, this is not entirely true. XML is - by definition - an ordered set of elements. The order is an implicit part of the document. This is not true for attributes. Furthermore you can enforce the order using an XSD with xs:sequence. And there are canonical XML documents, where the attributes must appear in a fixed lexicographic order. This can be very important for signatures...
Oh thanks, looks like I was hornswoggled by false information. I changed my solution to iterate over all fields and enter a then branch with some special handling if the name is field, otherwise return the elements unchanged. Element order is retained this way, should be as generic as possible now 👍

I'd avoid the cast to a string type due to side effects (but this might be the easiest approach, especially if the XML might include other nodes, which you do not show in your example...)

I'd avoid loops too.

My approach was to shred and re-create the XML:

DECLARE @xml XML=
N'<record id="1">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-GB">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-GB">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>';

--The query will read all field's values and rebuild the XML with the changed language

WITH Shredded AS
(
    SELECT fld.value(N'@tag',N'nvarchar(max)') AS tag
          ,fld.value(N'@occ',N'int') AS occ
          ,fld.value(N'@lang',N'nvarchar(max)') AS lang
          ,fld.value(N'(./text())[1]',N'nvarchar(max)') AS content
    FROM @xml.nodes(N'/record/field') AS A(fld)
)
SELECT @xml.value(N'(/record/@id)[1]',N'int') AS [@id]
     ,(
        SELECT   tag AS [@tag]
                ,occ AS [@occ]
                ,CASE WHEN lang='en-GB' THEN 'en-US' ELSE lang END AS [@lang]
                ,content AS [*]
        FROM Shredded
        FOR XML PATH('field'),TYPE
      ) AS [*]
FOR XML PATH(N'record')

The result

<record id="1">
  <field tag="DI" occ="1" lang="de-DE">Höhe</field>
  <field tag="DI" occ="1" lang="en-US">height</field>
  <field tag="WA">173</field>
  <field tag="EE">cm</field>
  <field tag="DI" occ="2" lang="de-DE">Breite</field>
  <field tag="DI" occ="2" lang="en-US">width</field>
  <field tag="WA">55</field>
  <field tag="EE">cm</field>
</record>

3 Comments

Thanks. It would lose all nodes however, that are not shredded/re-created explicitly, which is undesired in my case.
@CoDEmanX, yeah, that's why I wrote especially if the XML might include other nodes... Any structured approach can run into trouble if you do not know the full structure in advance. My second answer with the FLWOR XQuery is more flexible in this respect... But the simple cast-and-replace approach might be the best for you...
The structure is dynamic in my use case, so the structured approach isn't viable. But I really appreciate that you showed this option too.

Yeah, unfortunately, the replace value of statement only updates one node at a time. So in your case, a quick and dirty replace would be the easiest to write (and, with luck, maybe even the fastest to run):

update t set [data] = cast(
  replace(cast(t.[data] as nvarchar(max)), N' lang="en-GB"', N' lang="en-US"')
as xml)
from dbo.Records t
where t.[data].exist('/record/field[@lang="en-GB"]') = 1;

If your XML schema varies so that there is no guarantee the /record node will always be at the top level, you might want to modify the filter like this:

where t.[data].exist('//record/field[@lang="en-GB"]') = 1;

Another approach would be to use a FLWOR statement, but if the XML structure varies significantly and contains other unpredictable nodes, it becomes rather difficult not to lose anything accidentally, and the extra handling this requires will in turn lead to poorer performance. For this approach to be viable, your XML schema has to be very stable.

3 Comments

This would replace "en-GB" in the entire XML including text nodes if I see that correctly, which seems a little insecure. It looks like the solution with the least amount of code however. Thanks.
@CoDEmanX On string level you can replace lang="en-GB" with lang="en-US" (with a leading blank) to minimize this danger...
@CoDEmanX, yes, insecure, though Shnugo's remark makes it more viable. I have updated my answer accordingly.
