
Here is the dilemma:

I am building a JavaScript effect. To do so, I pull a node and its children (including images) with .innerHTML and then try to parse the result via the DOM. When the parser reaches the image tags, it throws a parse error. When I alert the innerHTML, I see that the closing slash has been stripped from the IMG tags.

I am not sure whether the problem is with the parser or with innerHTML. How can I take this node, grab its internal contents, and parse them as XML?

Looks like a similar thing happened here: innerHTML removing closing slash from image tag

(This is the only page on the internet I found that touches on this issue after almost two days of searching.)

Here is the parse code I am using:

function loadXMLString(txt) {
    var xmlDoc;
    if (window.DOMParser) {
        // Standards-based browsers
        var parser = new DOMParser();
        xmlDoc = parser.parseFromString(txt, "text/xml");
    } else { // Internet Explorer (ActiveX)
        xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
        xmlDoc.async = false; // load synchronously
        xmlDoc.loadXML(txt);
    }
    return xmlDoc;
}

The resolution there was to change the MIME type, but how do you do that with the JavaScript parsers (both the MS ActiveX one and the standard one in other browsers)? Which MIME type should I use?

Here is the DOM Element I am attempting to parse:

<div style="display:none" id="ItemsContainer" name="ItemsContainer">
    <SPAN>
       <a href="url1"><img src="1.jpg" alt="alt1" /></a>
       <a href="url2"><img src="2.jpg" alt="alt2" /></a>
       <a href="url3"><img src="3.png" alt="alt3" /></a>
       <a href="url4"><img src="4.jpg" alt="alt4" /></a>
    </SPAN>
</div>

If I rename the IMG tags to some other tag name, it works. It seems that either innerHTML is breaking the tag or the parser can't parse IMG tags.

Please advise. Thanks In Advance!

  • Have you tried using the application/xhtml+xml MIME type? That would be more suitable for your document than text/xml, as strict XML requires closing tags, if I'm not mistaken. Commented Jan 6, 2012 at 21:26
  • Have you considered document.getElementById("ItemsContainer").getElementsByTagName("img")? Commented Jan 6, 2012 at 21:38
  • mplungjan, I haven't. I will consider that. Commented Jan 6, 2012 at 21:40
  • Diodeus, the XML approach is because of what I do to it afterwards in the JavaScript: I take all the information and turn it into graphic and overlay effects, and I want to keep it in standard HTML for SEO purposes. Matt, if I do that, how do I change the MIME type for the ActiveX version of the parser? Commented Jan 6, 2012 at 21:42
  • Would client-side processing of some innerHTML be interesting for SEO? The spider would need a pretty clever parser to make any sense of JavaScript-manipulated content. Commented Jan 6, 2012 at 22:24

2 Answers


IE automatically capitalizes tag names (so <a> becomes <A>), so I used txt.replace(/><\/a>/g, " /></a>").replace(/><\/A>/g, " /></A>") to make the markup XML compliant.

Thanks to all who helped!
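
The two chained replaces could probably be folded into one case-insensitive pass. The following is only a sketch: closeTagBeforeAnchorEnd is a hypothetical name, and it assumes, as in the markup from the question, that every closing anchor tag is immediately preceded by an IMG tag that has lost its closing slash. It would produce invalid output for anchors that end with some other element or with an already self-closed image.

```javascript
// Hypothetical helper: one case-insensitive pass instead of two chained replaces.
// Assumption: each </a> or </A> directly follows an unclosed <img ...> tag.
function closeTagBeforeAnchorEnd(txt) {
    // $1 re-emits the closing tag exactly as it was found, so </A> stays </A>
    // and still matches the case of its opening tag, which XML requires.
    return txt.replace(/>(<\/a>)/gi, " />$1");
}
```

For example, closeTagBeforeAnchorEnd('<A href="url1"><IMG src="1.jpg" alt="alt1"></A>') returns '<A href="url1"><IMG src="1.jpg" alt="alt1" /></A>'.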



I'm assuming you're getting the "txt" variable by using innerHTML? I tested that in various browsers, and it does indeed strip the closing slash from the IMG tags. Perhaps, before sending it to the function loadXMLString, you could add the slashes back using a regular expression?

// A regex literal keeps \b as a word boundary; inside a string literal "\b" is a backspace character.
var re = /(<img\b[^>]*)>/g;
txt = txt.replace(re, "$1 />");
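
A slightly more defensive variant of the same idea might look like the sketch below. The name selfCloseImgTags is hypothetical, and the sketch assumes that no ">" character appears inside an attribute value. It matches tag names case-insensitively (IE upper-cases them) and does not add a second slash to a tag that is already self-closed.

```javascript
// Hypothetical helper: re-add the closing slash that innerHTML drops from <img> tags.
// Assumption: attribute values never contain a ">" character.
function selfCloseImgTags(html) {
    // attrs captures the attribute text, excluding any trailing whitespace,
    // optional "/" and the final ">".
    return html.replace(/<img\b([^>]*?)\s*\/?>/gi, function (match, attrs) {
        // match.slice(0, 4) keeps the original case: "<img" or "<IMG".
        return match.slice(0, 4) + attrs + " />";
    });
}
```

Called as txt = selfCloseImgTags(txt) before passing txt to loadXMLString, this turns '<img src="1.jpg" alt="alt1">' into '<img src="1.jpg" alt="alt1" />' and leaves '<IMG src="1.jpg" />' as it is.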

1 Comment

I used this approach to solve the problem, though not the exact code. Part of the reason is that IE automatically capitalizes tag names (so <a> becomes <A>), so I used txt.replace(/><\/a>/g, " /></a>").replace(/><\/A>/g, " /></A>") to make it XML compliant. This made it work. Great! It did leave one issue, though: in the ALT attribute of the IMG tag, IE's innerHTML doesn't add the quote marks if the value is only one word long. Sigh... IE. I will work through this separately. Thanks to all who helped!
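
The unquoted-attribute issue mentioned in this comment could perhaps be handled with a second string pass before parsing. What follows is a rough sketch rather than a tested IE fix: quoteAttributeValues is a hypothetical name, and it assumes that attribute values contain no ">" character and that an unquoted value ends at the next whitespace or at the end of the tag.

```javascript
// Hypothetical helper: wrap unquoted attribute values in double quotes.
// Assumption: no ">" appears inside any attribute value.
function quoteAttributeValues(html) {
    // Visit each opening tag in turn.
    return html.replace(/<[a-zA-Z][^>]*>/g, function (tag) {
        // Match name=value pairs; quoted values are consumed whole and left alone.
        return tag.replace(/([\w:-]+)\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)/g,
            function (pair, name, value) {
                var first = value.charAt(0);
                if (first === '"' || first === "'") {
                    return pair; // already quoted
                }
                return name + '="' + value + '"';
            });
    });
}
```

For example, quoteAttributeValues('<IMG alt=alt1 src="1.jpg">') returns '<IMG alt="alt1" src="1.jpg">'.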
