0

I have a string that contains a fair bit of XML; it is actually the XML that describes a Word document (document.xml). I simply want to replace part of the string with an empty string, effectively removing it. This sounds straightforward, but I'm not getting the result I expect.

Here is what some of the XML looks like; this is just the first 10 lines:

<w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
    <w:p w:rsidR="00CB3A3E" w:rsidP="00257CF7" w:rsidRDefault="008C1E91">
        <w:pPr>
            <w:pStyle w:val="Heading-Title" />
        </w:pPr>
        <w:r>
            <w:t>References</w:t>
        </w:r>
    </w:p>
    <w:sdt> 

As I said, this is in a string. I simply try to replace <w:t>References</w:t> with an empty string, like so:

//xmlBody is the string that is holding the xml
xmlBody.Replace("<w:t>References</w:t>", " ");

This is not working, the string is unaltered when I do this. What am I doing wrong? Any advice would be appreciated, thanks much!

    Please try to use proper XML objects to manipulate XML. First, you will not produce invalid XML this way, and second, you'll avoid asking "how to parse/search XML with regular expressions" when you find that <w:t> could be on a separate line from the text, or something similar. Commented Aug 8, 2012 at 19:34

5 Answers

3
xmlBody = xmlBody.Replace("<w:t>References</w:t>", "");

The Replace function doesn't change the source string; it returns a new string with the changes made. In fact, C# strings cannot be changed. If you check out the documentation, it says that they're immutable.
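A minimal sketch of the difference, assuming the search text appears in the string exactly as written. The names `ReplaceDemo`, `CallWithoutAssigning`, and `CallAndAssign` are made up for illustration:

```csharp
using System;

public static class ReplaceDemo
{
    // Mirrors the code in the question: the result of Replace is discarded,
    // so the value handed back is the original, unmodified string.
    public static string CallWithoutAssigning(string xmlBody)
    {
        xmlBody.Replace("<w:t>References</w:t>", "");
        return xmlBody;
    }

    // Keeps the new string that Replace returns by assigning it back.
    public static string CallAndAssign(string xmlBody)
    {
        xmlBody = xmlBody.Replace("<w:t>References</w:t>", "");
        return xmlBody;
    }
}
```

Given the input `<w:r><w:t>References</w:t></w:r>`, the first method returns the input unchanged, while the second returns `<w:r></w:r>`.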


3

In C#, string is not mutable - once created, it cannot be changed. Replace returns a new instance of string. Therefore, you need to catch its return value:

xmlBody = xmlBody.Replace("<w:t>References</w:t>", " ");

As a sidenote, it is not considered a good practice to parse XML-based strings with regular expressions, because it's too fragile. Consider using XDocument or some such to remove elements you are after.
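One way this could look with LINQ to XML is sketched below. It assumes the string holds a complete, well-formed fragment with a single root element (the excerpt in the question is truncated, so it would not parse as shown), and the names `WordXmlCleaner` and `RemoveTextElements` are hypothetical:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class WordXmlCleaner
{
    // Namespace URI taken from the xmlns:w declaration in the question.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: removes every <w:t> element whose text equals `text`
    // and returns the resulting XML as a string.
    public static string RemoveTextElements(string xmlBody, string text)
    {
        // Throws XmlException if xmlBody is not well-formed XML.
        XElement root = XElement.Parse(xmlBody);

        // Match on element name and value, independent of whitespace or
        // line breaks between the tags in the source string.
        root.Descendants(W + "t")
            .Where(t => t.Value == text)
            .Remove();

        // DisableFormatting avoids adding indentation to the output.
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

Calling `xmlBody = WordXmlCleaner.RemoveTextElements(xmlBody, "References");` removes the matching `<w:t>` elements and leaves the enclosing `<w:r>` runs in place. Removing the empty runs as well would be a separate decision.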


2

string.Replace returns a new string; it doesn't change the original string.

Try:

xmlBody = xmlBody.Replace("<w:t>References</w:t>", " ");


1

Replace isn't an in-place replacement; it returns a new string with the replacement made.


0

Replace everything between <w:t> tags with an empty string:

xmlBody = Regex.Replace(xmlBody,
                        @"<w:t>[\s\S]*?</w:t>",
                        "<w:t></w:t>");
