I have an XHTML file containing links with multiple query parameters:

index.jsp?foo=bar&foo2=bar2&foo3=bar3.

Saxon 9.5 tries to interpret &foo2 as an entity reference and fails. I cannot change the XML (it is a web page from the internet). I could pre-process it with a regex, but I would like to avoid programming if possible.

java -jar %SAXON_HOME%\saxon9he.jar -xsl:transfo.xsl -s:pageWeb.xml -o:result.html -dtd:off --recognize-uri-query-parameters:false

does not work. Is this possible without modifying the XML?

Thank you

1 Answer

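If you feed an XML parser something that is not well-formed XML, the parser is going to reject it; that is what the specification is for. In well-formed XML a literal & inside an attribute value has to be written as &amp;, so a bare &foo2 is read as the start of an entity reference and the parse fails. Saxon simply relies on an XML parser to read its input documents and stylesheets, so no Saxon option will get past that.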
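To see that the rejection comes from the XML parser rather than from Saxon, here is a minimal sketch that uses only the JDK's built-in SAX parser. The class name WellFormedDemo and the helper isWellFormed are made up for illustration; the sample link is the one from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Saxon or any library.
class WellFormedDemo {

    // Returns true if the JDK's default SAX parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Well-formedness violations are reported as SAXException (fatal errors).
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String bare = "<a href=\"index.jsp?foo=bar&foo2=bar2&foo3=bar3\">link</a>";
        String escaped = "<a href=\"index.jsp?foo=bar&amp;foo2=bar2&amp;foo3=bar3\">link</a>";
        System.out.println("bare ampersands accepted:    " + isWellFormed(bare));
        System.out.println("escaped ampersands accepted: " + isWellFormed(escaped));
    }
}
```

Only the second string is accepted, which is why switches such as -dtd:off make no difference: the input never gets past the parser.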

If your input is not well-formed, you can try a different parser, such as TagSoup or the HTML5 parser, and tell Saxon to use it with the -x option, e.g. java -jar %SAXON_HOME%\saxon9he.jar -x:org.ccil.cowan.tagsoup.Parser ... or java -jar %SAXON_HOME%\saxon9he.jar -x:nu.validator.htmlparser.sax.HtmlParser .... The parser's jar has to be on the classpath; since java -jar ignores the classpath, you will probably need to start Saxon with java -cp and its main class net.sf.saxon.Transform instead.
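If you ever drive the transformation from Java instead of the command line, the same parser swap can be done through the standard JAXP API by wrapping the input in a SAXSource. The following is only a rough sketch: ParserSwapDemo, transform and jdkReader are hypothetical names, the files are assumed to be UTF-8, TagSoup is assumed to be on the classpath when main runs, and the stylesheet is executed by whichever JAXP XSLT processor is configured.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical demo class, not part of Saxon or TagSoup.
class ParserSwapDemo {

    // Applies the stylesheet to the input, reading the input with the given SAX parser.
    static String transform(XMLReader reader, String input, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            SAXSource source = new SAXSource(reader, new InputSource(new StringReader(input)));
            StringWriter out = new StringWriter();
            transformer.transform(source, new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // The JDK's own strict, namespace-aware XML parser, for comparison with a lenient one.
    static XMLReader jdkReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Loaded by name so the sketch compiles without TagSoup; assumes the jar is on the classpath.
        XMLReader tagSoup = (XMLReader) Class.forName("org.ccil.cowan.tagsoup.Parser")
                .getDeclaredConstructor().newInstance();
        // Assumption: both files are UTF-8 encoded.
        String page = new String(Files.readAllBytes(Paths.get("pageWeb.xml")), StandardCharsets.UTF_8);
        String xsl = new String(Files.readAllBytes(Paths.get("transfo.xsl")), StandardCharsets.UTF_8);
        String result = transform(tagSoup, page, xsl);
        Files.write(Paths.get("result.html"), result.getBytes(StandardCharsets.UTF_8));
    }
}
```

Passing the XMLReader in as a parameter keeps the transformation code independent of the parser, so the HTML5 parser can be substituted by changing only the class name in main.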
