I am loading the XML below with LoadXml. I need to remove an entire <site> ... </site> node based on a condition, using C#.

This is what I tried:

  XmlDocument xdoc= new XmlDocument(); 
  xdoc.LoadXml(xmlpath); 
  string xml = xdoc.InnerXml.ToString();

  if(xml.Contains("href"+"\""+ "www.google.com" +"\""))
  {
  string removenode = "";  // If href=www.google.com is present in xml then remove the  entire node. Here providing the entire <site> .. </site>
  xml.Replace(removenode,"");
  }

The node is not removed from the XML.

The XML is:

 <websites>
 <site>
 <a xmlns="http://www.w3.org/1999/xhtml" href="www.google.com"> Google </a>
 </site>
 <site>
 <a xmlns="http://www.w3.org/1999/xhtml" href="www.hotmail.com"> Hotmail </a>
 </site>
 </websites>

2 Answers

Here's an example that removes every site element containing an element whose href attribute contains www.google.com. It uses LINQ to XML (XDocument) instead of XmlDocument.

using System.Diagnostics;
using System.Linq;
using System.Xml.Linq;

namespace ConsoleApplication6
{
    class Program
    {
        static void Main(string[] args)
        {
            const string frag = @" <websites>
 <site>
 <a xmlns=""http://www.w3.org/1999/xhtml"" href=""www.google.com""> Google </a>
 </site>
 <site>
 <a xmlns=""http://www.w3.org/1999/xhtml"" href=""www.hotmail.com""> Hotmail </a>
 </site>
 </websites>";

            var doc = XDocument.Parse(frag);

            //Locate all the elements that contain the attribute you're looking for
            var invalidEntries = doc.Document.Descendants().Where(x =>
            {
                //Get the href attribute from the element
                var hrefAttribute = x.Attribute("href");
                //Check to see if the attribute existed, and, if it did, if it has the value you're looking for
                return hrefAttribute != null && hrefAttribute.Value.Contains("www.google.com");
            });

            //Find the site elements that are the parents of the elements that contain bad entries
            var toRemove = invalidEntries.Select(x => x.Ancestors("site").First()).ToList();

            //For each of the site elements that should be removed, remove them
            foreach(var entry in toRemove)
            {
                entry.Remove();
            }

            Debugger.Break();
        }
    }
}
2 Comments

Great, it worked. But my XML can also be generated with a default namespace on the root element: <websites xmlns="id:4457-211445-15151-151"> <site> <a xmlns="http://www.w3.org/1999/xhtml" href="www.google.com"> Google </a> </site> <site> <a xmlns="http://www.w3.org/1999/xhtml" href="www.hotmail.com"> Hotmail </a> </site> </websites> In that case it fails with "Sequence contains no elements".
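Replace the var toRemove... line with var toRemove = invalidEntries.Select(x => x.Ancestors().First(y => y.Name.LocalName == "site")).ToList(); The namespace caused the issue: with a default xmlns on <websites>, the <site> elements are in that namespace, so Ancestors("site"), which only matches a name with no namespace, finds nothing.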
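As an alternative to matching on LocalName, the namespace-qualified name can be used directly. The following is only a sketch, assuming the root element carries the default namespace from the comment above and that the href value should match exactly; the class name and variable names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class RemoveSiteSketch
{
    static void Main()
    {
        // Sample input taken from the comment above (default namespace on the root)
        const string xml = @"<websites xmlns=""id:4457-211445-15151-151"">
 <site>
 <a xmlns=""http://www.w3.org/1999/xhtml"" href=""www.google.com""> Google </a>
 </site>
 <site>
 <a xmlns=""http://www.w3.org/1999/xhtml"" href=""www.hotmail.com""> Hotmail </a>
 </site>
</websites>";

        var doc = XDocument.Parse(xml);

        // Take the namespace from the root so the code works with or without xmlns
        XNamespace ns = doc.Root.Name.Namespace;

        // Select the <site> elements that contain any element with the unwanted href.
        // The string cast returns null when the attribute is missing, so no null check is needed.
        doc.Root.Elements(ns + "site")
           .Where(site => site.Descendants()
                              .Any(e => (string)e.Attribute("href") == "www.google.com"))
           .Remove(); // Extensions.Remove snapshots the sequence before detaching the nodes

        // Only the www.hotmail.com <site> element should remain
        Console.WriteLine(doc);
    }
}
```

Reading the namespace from doc.Root means the same code handles both the original XML (no namespace) and the namespaced variant, at the cost of assuming that every <site> element shares the root's namespace.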
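I think you should work with the XML DOM and XPath for this rather than string replacement. Note that String.Replace returns a new string and never changes the loaded document. Try the following (xdoc is the XmlDocument from the question):

// The <a> elements are in the XHTML namespace, so XPath needs a prefix mapped to it
XmlNamespaceManager ns = new XmlNamespaceManager(xdoc.NameTable);
ns.AddNamespace("x", "http://www.w3.org/1999/xhtml");

// Element names are case-sensitive: "site", not "Site".
// Copy the matches to a list first (needs System.Linq and System.Collections.Generic),
// because the document is modified inside the loop
List<XmlNode> sites = xdoc.DocumentElement.SelectNodes("site").Cast<XmlNode>().ToList();

foreach (XmlNode n in sites)
{
    XmlNode a = n.SelectSingleNode("x:a", ns);
    if (a != null && a.Attributes["href"] != null && a.Attributes["href"].Value == "www.google.com")
    {
        n.ParentNode.RemoveChild(n);
    }
}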

Hope that helps.

Milind
