13

In my code, I want to remove every img tag that has no src value. I am using HtmlAgilityPack's HtmlDocument object. I find each img without a src value and try to remove it, but I get the error "Collection was modified; enumeration operation may not execute." Can anyone help me with this? The code I am using is:

foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
{
    if (node.Name.ToLower() == "img")
    {
        string src = node.Attributes["src"].Value;
        if (string.IsNullOrEmpty(src))
        {
            node.ParentNode.RemoveChild(node, false);
        }
    }
    else
    {
        ..........// i am performing other operations on document
    }
}

4 Answers

28

It seems you're modifying the collection during enumeration by calling the HtmlNode.RemoveChild method.

To fix this, copy the nodes to a separate list or array first, e.g. by calling Enumerable.ToList<T>() or Enumerable.ToArray<T>(), and iterate over that copy.

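// SelectNodes returns null (not an empty collection) when nothing matches.
var matches = doc.DocumentNode
    .SelectNodes("//img[not(string-length(normalize-space(@src)))]");

if (matches != null)
{
    // Iterate over a copy so that removing nodes cannot disturb the enumeration.
    foreach (var node in matches.ToList())
        node.Remove();
}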

If I'm right, the problem will disappear.
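
The failure is not specific to HtmlAgilityPack. As a minimal sketch using only the standard library (the SnapshotDemo class and its method names are hypothetical, chosen for illustration), the following shows that removing items from a List<T> inside a foreach over that same list throws InvalidOperationException, while enumerating a ToList() copy and mutating the original does not:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    // Removes empty strings while enumerating the same list.
    // List<T> notices the modification on the next MoveNext call and throws.
    public static bool RemoveWhileEnumeratingThrows()
    {
        var items = new List<string> { "a.png", "", "b.png" };
        try
        {
            foreach (var item in items)
            {
                if (string.IsNullOrEmpty(item))
                    items.Remove(item);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Enumerates a snapshot (ToList) and removes from the original list.
    public static List<string> RemoveViaSnapshot()
    {
        var items = new List<string> { "a.png", "", "b.png" };
        foreach (var item in items.ToList())
        {
            if (string.IsNullOrEmpty(item))
                items.Remove(item);
        }
        return items;
    }

    public static void Main()
    {
        Console.WriteLine(RemoveWhileEnumeratingThrows());
        Console.WriteLine(string.Join(",", RemoveViaSnapshot()));
    }
}
```

The same idea applies to the node tree: take the snapshot first, then remove.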


1 Comment

@Piya, glad to hear that. But I think using a single XPath expression makes your code more readable (just select all the nodes to remove with one expression).
12

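What I have done is collect the XPath of each matching node first, then remove the nodes in a second pass:

    List<string> xpaths = new List<string>();
    foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
    {
        if (node.Name.ToLower() == "img")
        {
            // GetAttributeValue returns the default ("") when src is missing,
            // so this does not throw on <img> tags without a src attribute.
            string src = node.GetAttributeValue("src", "");
            if (string.IsNullOrEmpty(src))
            {
                xpaths.Add(node.XPath);
            }
        }
    }

    // Remove in reverse document order: node.XPath is positional (e.g. img[2]),
    // so removing an earlier sibling first would shift the paths that follow it.
    for (int i = xpaths.Count - 1; i >= 0; i--)
    {
        doc.DocumentNode.SelectSingleNode(xpaths[i]).Remove();
    }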


4
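var emptyImages = doc.DocumentNode
    .Descendants("img")
    .Where(x => x.Attributes["src"] == null || x.Attributes["src"].Value == String.Empty)
    .Select(x => x.XPath)
    .ToList();

// XPath values are positional (e.g. img[2]); remove last-to-first so that
// earlier removals do not shift the paths still waiting in the list.
emptyImages.Reverse();

emptyImages.ForEach(xpath => {
    var node = doc.DocumentNode.SelectSingleNode(xpath);
    if (node != null) { node.Remove(); }
});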
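
One caveat when storing XPath strings instead of node references: the value of node.XPath is positional, so it can change once a sibling with the same name is removed. The sketch below is illustrative only; it assumes the HtmlAgilityPack NuGet package is referenced, and the XPathShiftDemo class and SecondImagePath method are hypothetical names. It records the path of the second img before and after the first one is removed:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class XPathShiftDemo
{
    // Returns the XPath of the second <img> before and after
    // the first <img> is removed from the same parent.
    public static (string Before, string After) SecondImagePath()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<div><img src=\"\"><img src=\"\"></div>");

        // Snapshot the nodes so the removal below cannot disturb enumeration.
        var images = doc.DocumentNode.Descendants("img").ToList();
        string before = images[1].XPath;

        images[0].Remove();
        string after = images[1].XPath;

        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = SecondImagePath();
        Console.WriteLine(before);
        Console.WriteLine(after);
    }
}
```

The two printed paths should differ, which is why a path saved before the first removal may no longer select the intended node. Removing in reverse document order, or keeping the HtmlNode references themselves, avoids the problem.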


1
var emptyElements = doc.DocumentNode
    .Descendants("img")
    .Where(x => x.Attributes["src"] == null || x.Attributes["src"].Value == String.Empty)
    .ToList();

// The nodes themselves are stored, so removal does not depend on XPath positions.
emptyElements.ForEach(node => node.Remove());

1 Comment

While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.
