I am trying to parse content using DocumentBuilder.

<html>
<head>
<meta charset="utf-8" />
<title>Test</title>
</head>
<body>
<img height="" src="google.gif?<>" />
</body>
</html>

Parsing fails with an exception saying that the src attribute value cannot contain <. I need to parse the document because I am applying XSL to it.

Is there any way to do this? As of now, I first unescape the content, parse it using DocumentBuilder, and then escape it again.

I retrieve the above XML as a String from the database. When I try to parse it using DocumentBuilder, I get an exception saying that src cannot contain <. I tried to escape it using StringEscapeUtils.escapeHtml, but that escapes the complete String, tags included, and DocumentBuilder still cannot parse the result. Please let me know how to escape only the src value in the HTML, as I have not been able to accomplish it.

  • This will be useful for xml encoding link Commented Aug 11, 2016 at 13:29
  • XML parsers are there to parse XML. This input isn't XML. You're going to have to repair it. Commented Aug 11, 2016 at 16:13

1 Answer

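The characters < and > delimit tags in XML, so a literal < is not allowed inside an attribute value (a literal > is tolerated there, but it is usually escaped as well). You have to write them using the predefined entities instead. Read more on Wikipedia.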

  • &gt; for >
  • &lt; for <
  • &quot; for "
  • &apos; for '
  • &amp; for &

Your markup would then be:

<img height="" src="google.gif?&lt;&gt;" />
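As a quick check, here is a minimal sketch showing that the escaped form parses with DocumentBuilder and that the parser hands back the original characters when the attribute is read. The class name EscapedSrcDemo and the helper readSrc are made up for illustration, and the document is shortened to the relevant elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question or answer.
class EscapedSrcDemo {

    // Parses the given XML string and returns the src attribute of the first <img>.
    static String readSrc(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element img = (Element) doc.getElementsByTagName("img").item(0);
            return img.getAttribute("src");
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<html><body>"
                + "<img height=\"\" src=\"google.gif?&lt;&gt;\" />"
                + "</body></html>";
        // The parser resolves the entities, so this prints: google.gif?<>
        System.out.println(readSrc(xml));
    }
}
```

The entities exist only in the serialized text. Once the document is parsed, the DOM (and therefore the XSL transformation) sees the plain < and > characters.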

3 Comments

I want to parse it with &#60; and &#62;.
I retrieve the above HTML as a String from the database. When I try to parse it using DocumentBuilder, I get an exception saying that src cannot contain <. I tried to escape it using StringEscapeUtils.escapeHtml, but that escapes the complete String and DocumentBuilder still cannot parse it. Please let me know how to escape only the src value in the HTML, as I have not been able to accomplish it.
If I replace them with the characters mentioned above, I get the following exception: org.xml.sax.SAXParseException: Reference is not allowed in prolog.
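"Reference is not allowed in prolog" is typically reported when an entity reference appears before the root element, which would be consistent with the whole string having been escaped so that it now starts with &lt;html&gt;. One possible workaround, sketched below, is to escape < and > only inside double-quoted attribute values before handing the string to DocumentBuilder. The class and method names are hypothetical, and the regular expression is a heuristic: it assumes attribute values are double-quoted and contain no embedded double quotes, and it does not handle bare & characters or text nodes that happen to contain ="...".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a library API: repairs only attribute values.
class AttributeEscaper {

    // Matches a double-quoted attribute value such as ="google.gif?<>".
    private static final Pattern ATTR_VALUE = Pattern.compile("=\"([^\"]*)\"");

    // Replaces < and > inside attribute values and leaves the tags untouched.
    static String escapeAngleBracketsInAttributes(String markup) {
        Matcher m = ATTR_VALUE.matcher(markup);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = m.group(1)
                    .replace("<", "&lt;")
                    .replace(">", "&gt;");
            // quoteReplacement keeps any $ or \ in the value from being
            // interpreted as a group reference.
            m.appendReplacement(out,
                    Matcher.quoteReplacement("=\"" + value + "\""));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String broken = "<img height=\"\" src=\"google.gif?<>\" />";
        // Prints: <img height="" src="google.gif?&lt;&gt;" />
        System.out.println(escapeAngleBracketsInAttributes(broken));
    }
}
```

For input that is real-world HTML rather than almost-XML, a lenient HTML parser that can emit well-formed XML may be more robust than a regular expression.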
