
I have some characters from the Unicode CJK (Chinese/Japanese/Korean) Extension B block in my XML:

𠀀𠀁𠀂𠀃𠀄𪛔𪛕𪛖 

But when I use streamReader.getText() it returns:

Does anyone know whether the encoding scheme that Java's XMLStreamReader uses for Unicode characters can be changed?

It works with common East Asian characters, just not with the ones in Unicode Extension B.

  • How are you constructing the XMLStreamReader? What do XMLStreamReader#getEncoding() and XMLStreamReader#getCharacterEncodingScheme() return? What encoding is the XML actually stored with? Commented Jan 30, 2013 at 20:46
  • Hi Matt, the XML is UTF-8 and XMLStreamReader#getCharacterEncodingScheme() returns UTF-8 as well. XMLStreamReader#getEncoding() returns null. The XMLStreamReader is created by XMLInputFactory.createXMLStreamReader(). Commented Jan 30, 2013 at 21:09

1 Answer


When you create the XMLStreamReader, you can specify the encoding as UTF-8 by using the XMLInputFactory method below:

abstract XMLStreamReader createXMLStreamReader(InputStream stream, String encoding)
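
As a rough sketch of how that overload might be used, the example below parses a small in-memory document containing two Extension B characters (U+20000 and U+20001) and collects the text. The class name `SupplementaryTextDemo` and the helper `readAllText` are made up for illustration, and enabling `IS_COALESCING` is an assumption about how you want text events delivered, not something the question requires. Note that Java represents each Extension B character as a surrogate pair, so it is safer to check code points than to judge the result by how a console prints it.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class SupplementaryTextDemo {

    // Hypothetical helper: collects all character data, decoding the bytes as UTF-8.
    static String readAllText(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: merge adjacent text chunks so getText() is not split mid-run.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        // The overload from the answer: the encoding is stated explicitly.
        XMLStreamReader reader = factory.createXMLStreamReader(in, "UTF-8");
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // U+20000 and U+20001 written as surrogate pairs so the source file stays ASCII.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<root>\uD840\uDC00\uD840\uDC01</root>";
        String text = readAllText(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Compare UTF-16 length with the number of code points.
        System.out.println("chars: " + text.length());
        System.out.println("code points: " + text.codePointCount(0, text.length()));
        // Print each code point in U+XXXX form, independent of console encoding.
        text.codePoints().forEach(cp -> System.out.println(String.format("U+%X", cp)));
    }
}
```

If the code points come back correct but the printed characters still look wrong, the problem is more likely in the output path (console or file encoding, or the font) than in the parser.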
