
I am getting the error org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x12) was found in the element content of the document. on the client side. What regular expression can I use with java.util.regex.Pattern to match such characters, so that I can strip them out on the server side?

I tried:

Pattern PATTERN = Pattern.compile("\0012");

but it didn't work.

2 Answers

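Most "control characters" (code points below 32, i.e. U+0000 to U+001F) are not legal in XML 1.0; only tab (0x09), line feed (0x0A), and carriage return (0x0D) are allowed. XML 1.1 permits the others, apart from NUL, but only when they are written as character references. If your users expect them to be supported, you may want to make sure you're using a parser which can handle the newer Recommendation.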
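If staying on XML 1.0 and stripping the offending characters on the server is acceptable, something along these lines might work. It is only a sketch: XmlSanitizer and stripInvalidXmlChars are made-up names, and the character class is my transcription of the XML 1.0 Char production, so check it against the spec before relying on it.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: removes every character
// that the XML 1.0 Char production does not allow.
final class XmlSanitizer {
    // Negated class of the legal XML 1.0 ranges: tab, LF, CR,
    // U+0020..U+D7FF, U+E000..U+FFFD and U+10000..U+10FFFF.
    // Anything outside those ranges is matched.
    private static final Pattern INVALID_XML_10 = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    static String stripInvalidXmlChars(String text) {
        return INVALID_XML_10.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Build the input with a cast so the control character is explicit.
        String dirty = "ab" + (char) 0x12 + "cd\tef";
        String clean = stripInvalidXmlChars(dirty);
        // The 0x12 character is removed; the tab is legal and is kept.
        System.out.println(dirty.length() + " -> " + clean.length());
    }
}
```

Silently dropping characters changes the data, so depending on the application it may be better to replace them with a placeholder or reject the input instead.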


When you need to look for some literal string with stuff in it that the regex parser is likely to have trouble with, use Pattern.quote() around the literal.

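Also, you're using an octal escape, not a Unicode one: you forgot the u after the backslash. In a Java string literal, "\0012" is the octal escape \001 (the character U+0001) followed by the digit 2, so the pattern never matches the 0x12 character.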

In this case:

Pattern PATTERN = Pattern.compile(Pattern.quote("\u0012"));

Note: I haven't tried this particular case!
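For what it's worth, here is a small sketch of how the difference between the two escapes could be checked and how the pattern might be used to strip the character. ControlCharDemo and stripDc2 are names I made up for illustration, and I left out Pattern.quote() here on the assumption that a single control character has no special meaning to the regex parser.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows why the octal escape failed and how the
// Unicode escape could be used to remove the 0x12 character.
final class ControlCharDemo {
    // U+0012 (DEVICE CONTROL TWO), the character named in the exception.
    static final Pattern DC2 = Pattern.compile("\u0012");

    static String stripDc2(String input) {
        return DC2.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Octal escape: the compiler reads \001 and then a literal '2'.
        String octal = "\0012";
        System.out.println(octal.length());          // 2
        System.out.println((int) octal.charAt(0));   // 1
        System.out.println(octal.charAt(1));         // 2

        // Unicode escape: a single character with code 0x12 (decimal 18).
        String unicode = "\u0012";
        System.out.println(unicode.length());        // 1
        System.out.println((int) unicode.charAt(0)); // 18

        // Removing the character from some sample content.
        System.out.println(stripDc2("ab" + unicode + "cd")); // abcd
    }
}
```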

1 Comment

@java1977 I thought it probably would, but I so often forget about regex specifics that I reach for quote() whenever there's any question. It's faster to write it than to read up on everything, and, unlike me, it never makes mistakes.
