0

How can I remove the whitespace from every instance of a particular node that I specify, using C#? For example, let's say that I have the following XML document:

<XML_Doc>

  <Record_1>
     <Name>Bob</Name>
     <ID_Number>12345</ID_Number>
     <Sample>
     </Sample>
  </Record_1>

  <Record_2>
     <Name>John</Name>
     <ID_Number>54321</ID_Number>
     <Sample>
     </Sample>
  </Record_2>

</XML_Doc>

What I would like is to take every instance of the <Sample> tag and change the formatting so it looks like this:

<XML_Doc>

  <Record_1>
     <Name>Bob</Name>
     <ID_Number>12345</ID_Number>
     <Sample></Sample>
  </Record_1>

  <Record_2>
     <Name>John</Name>
     <ID_Number>54321</ID_Number>
     <Sample></Sample>
  </Record_2>

</XML_Doc>

EDIT:

The other application which makes use of the XML file was not written by me and I cannot modify the structure of the XML document itself. I am writing a utility which parses the XML file and replaces each instance of the node I specify. The example where the <Sample></Sample> tag is on the same line is how the document is originally formatted and how I need it to be for the other application to be able to read it correctly.

7
  • It's not clear whether you just want formatted XML or whether you want to preserve whitespace and other characters in the XML. Your example seems to REMOVE the whitespace, not preserve it. Commented Sep 17, 2009 at 18:38
  • You are correct, I stated things backwards. Commented Sep 17, 2009 at 18:49
  • I don't see any issue with the original XML. The application should not behave differently if the closing tag is on another line rather than on the same line. Is the application that consumes this XML rejecting it? Commented Sep 17, 2009 at 18:50
  • I have no idea how the other application has been coded. It has been verified that, given the way it parses this XML file, it will not parse correctly when the opening and closing tags of that node are not on the same line. Commented Sep 17, 2009 at 19:04
  • If you are already building a parser that replaces each instance of a specific node, then surely all you need to do is take every instance of that node you find, in string form, and replace the whitespace in it with the empty string (using a Regex)? Commented Sep 17, 2009 at 20:38

4 Answers

1

If you're interested in preserving whitespace (i.e., tabs, carriage returns, and other formatting), you can use a CDATA section (unparsed character data).

<![CDATA[
]]>

However, if you just want the XML document formatted a certain way for aesthetic purposes, I would advise you to leave it alone.

To write a CDATA section into an XML Document use the following code:

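// Assumes doc is an existing XmlDocument and item is an existing node in it.
XmlNode itemDescription = doc.CreateElement("description");
// The CDATA content is written verbatim, markup characters and whitespace included.
XmlCDataSection cdata = doc.CreateCDataSection("<P>hello world</P>");
itemDescription.AppendChild(cdata);
item.AppendChild(itemDescription);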

This produces

<description><![CDATA[<P>hello world</P>]]></description>

1

I think you want to set

xml.Settings.NewLineHandling = NewLineHandling.Entitize;


1

I would suggest using an XSLT transform. Doing it this way gives you control over whether whitespace is stripped or preserved.

MSDN has an article that addresses this issue, see: Controlling White Space with the DOM
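
For the DOM route, here is a minimal sketch of what the question seems to be asking for. It assumes the target elements contain nothing but whitespace; WhitespaceCollapser and Collapse are hypothetical names made up for illustration, not part of any library. The idea is to load the document with PreserveWhitespace set to true so the rest of the formatting is left untouched, remove the whitespace children of each matching element, and set IsEmpty to false so the element is written in the long <Sample></Sample> form instead of <Sample />.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper for illustration only.
static class WhitespaceCollapser
{
    public static string Collapse(string xml, string elementName)
    {
        // PreserveWhitespace keeps every other line break and indent as loaded.
        var doc = new XmlDocument { PreserveWhitespace = true };
        doc.LoadXml(xml);

        // GetElementsByTagName returns a live list, so copy the matches
        // before changing the tree.
        var targets = new List<XmlElement>();
        foreach (XmlNode node in doc.GetElementsByTagName(elementName))
            targets.Add((XmlElement)node);

        foreach (XmlElement element in targets)
        {
            // Leave elements with real content alone.
            if (!ContainsOnlyWhitespace(element))
                continue;

            while (element.HasChildNodes)
                element.RemoveChild(element.FirstChild);

            // false selects the long tag format: <Sample></Sample>
            element.IsEmpty = false;
        }

        return doc.OuterXml;
    }

    static bool ContainsOnlyWhitespace(XmlElement element)
    {
        foreach (XmlNode child in element.ChildNodes)
        {
            if (child.NodeType != XmlNodeType.Whitespace &&
                child.NodeType != XmlNodeType.SignificantWhitespace)
                return false;
        }
        return true;
    }
}

class Program
{
    static void Main()
    {
        string xml =
            "<XML_Doc>\n  <Record_1>\n     <Name>Bob</Name>\n" +
            "     <Sample>\n     </Sample>\n  </Record_1>\n</XML_Doc>";
        Console.WriteLine(WhitespaceCollapser.Collapse(xml, "Sample"));
    }
}
```

Note that this sketch also rewrites an element that was already in the short form (<Sample />) into the long form, which may or may not be what the other application expects. To work on a file rather than a string, doc.Load(path) and doc.Save(path) can be used in place of LoadXml and OuterXml.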


0

Based on your comments, I think you may be able to use the XmlWriterSettings.NewLineHandling property to change how the text node handles the carriage returns. You'll have to experiment. I suspect the receiving application has some line parsing issue and is looking for a \r\n or something.
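
If you want to experiment with that property, a rough sketch follows. It assumes the document is held in an XmlDocument; NewLineDemo and Save are hypothetical names used only for this example. The settings object is passed to XmlWriter.Create, so NewLineHandling has to be chosen before the writer is created. Whether any of the three values (Replace, Entitize, None) satisfies the receiving application is something only testing against that application can tell.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration only.
static class NewLineDemo
{
    public static string Save(XmlDocument doc, NewLineHandling handling)
    {
        var settings = new XmlWriterSettings
        {
            NewLineHandling = handling,   // Replace, Entitize, or None
            OmitXmlDeclaration = true     // keeps the demo output short
        };

        var output = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            doc.Save(writer);
        }
        return output.ToString();
    }
}

class Program
{
    static void Main()
    {
        var doc = new XmlDocument { PreserveWhitespace = true };
        doc.LoadXml("<Sample>line one\nline two</Sample>");

        // Write the same document once per setting to compare the results.
        foreach (NewLineHandling handling in Enum.GetValues(typeof(NewLineHandling)))
        {
            Console.WriteLine(handling + ":");
            Console.WriteLine(NewLineDemo.Save(doc, handling));
        }
    }
}
```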
