I have a problem with a SAX XML parser. I want to parse an XML file which is obviously not well-formed (I get an ExpatParser$ParseException: At line 5, column 169: not well-formed (invalid token)). I know what is wrong, but the XML file is not created by me, so I can't change it.

Now I want to handle that error in my DefaultHandler. But neither error() nor fatalError() nor warning() is invoked.

Can I somehow interrupt the parsing process, tell the parser what to do with that piece of invalid XML, and continue parsing?

Thanks, JPM

  • If I were you, I would write some sort of cleanup code that you pass the XML through before the SAX parser, or tell your source to fix their XML if at all possible, because it would take them all of three seconds to fix a minor syntax error. Commented Apr 28, 2011 at 22:11
  • I have exactly the same problem: stackoverflow.com/questions/5673423/… Commented Apr 29, 2011 at 3:48
  • This is a bit like life giving you lemons; the SAX Parser cannot make apple juice with lemons. For the record this is the appropriate response to the guy that is giving you the lemons: "I don't want your damn lemons! What the hell are these?! Demand to see life's manager! Make life rue the day it thought it could give Cave Johnson lemons! Do you know who I am? I'm the man who's gonna burn your house down! WITH THE LEMONS! I'm gonna get my engineers to invent a combustible lemon that BURNS YOUR HOUSE DOWN!" (Portal 2) Commented Apr 29, 2011 at 8:05
  • stackoverflow.com/questions/4574710/… Commented May 3, 2011 at 1:58

1 Answer

I would guess that this SAXParseException is a fatal error that the SAX parser cannot recover from. In that case you probably need to fix up the bad tag before trying to parse it (as Robert suggests in his comment).
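To see where the parser gives up, one option is to catch the exception thrown by parse() itself rather than waiting for a handler callback. The following is only a sketch against the standard JAXP API on a desktop JVM; the class and method names (ParseErrorLocation, firstErrorLocation) are made up for illustration, and I am assuming the Android ExpatParser exception can be caught the same way as a SAXParseException.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
final class ParseErrorLocation {

    /** Returns "line:column" of the first fatal parse error, or null if the input parses. */
    static String firstErrorLocation(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null; // parsed to the end without a fatal error
        } catch (SAXParseException e) {
            // The parser has already stopped here; it cannot be resumed.
            return e.getLineNumber() + ":" + e.getColumnNumber();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

This only tells you where the input is broken, which can help decide what the cleanup step has to repair. It does not let the parser continue past the error.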

You might want to look into using a Java regex to fix up the known badness in the XML, e.g. "Regex for quoting unquoted XML attributes".

For the record, I am not advocating using regex to actually parse XML!
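As a rough illustration of the cleanup-then-parse idea, here is a minimal sketch. It assumes the known badness is an unquoted attribute value (as in the linked example), which may not be what is wrong in your file. The class name XmlCleanup, its methods, and the pattern are all hypothetical. The pattern also assumes attribute values contain no whitespace or slashes, and that nothing resembling ` name=value` appears in text content or inside quoted values.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
final class XmlCleanup {

    // Assumed pattern: whitespace, an attribute name, '=', then an unquoted value.
    private static final Pattern UNQUOTED_ATTR =
        Pattern.compile("(\\s[\\w:.-]+)=([^\\s\"'<>/=]+)");

    /** Wraps unquoted attribute values in double quotes; quoted ones are left alone. */
    static String quoteAttributes(String xml) {
        return UNQUOTED_ATTR.matcher(xml).replaceAll("$1=\"$2\"");
    }

    /** Returns true if a SAX parser accepts the input, false on any SAX error. */
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // includes SAXParseException for malformed input
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

You would run quoteAttributes over the raw document first and hand the result to your real handler in place of the bare DefaultHandler used here. If the source ever changes the shape of its errors, the pattern will need to change with it.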

1 Comment

Thanks Dan and Robert, I guess I will do that. Since the XML is quite simple, maybe I can parse it manually. I have to work on something else first, but I think one of those ways will solve my problem (and I still have hope we can get the source to invest the 2 seconds to fix their XML :-) ). Thanks, JPM
