XMLParser encoding problems

Question

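// Parses the stream and keeps the document's root element in the `node` field.
public XMLParser(InputStream is) {
    try {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        // parse(InputStream) leaves encoding detection to the parser
        Document doc = db.parse(is);
        node = doc.getDocumentElement();
    } catch (Exception e) {
        // On any failure the error is only logged and node stays unset
        DebugLog.log(e);
    }
}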

The InputStream contains content like: "Hey there this is a ü character." In the source, the character 'ü' is written as the entity '&uuml;'.

When I read the node's content with System.out.println(node.getTextContent()), I get "hey there this is a character." The ü is cut off.

Community · Accepted Answer · 2017-05-23 11:56:17Z

0

Well, is this a valid document? Does it specify an encoding? See http://www.w3schools.com/XML/xml_encoding.asp

Those might help:

Howto let the SAX parser determine the encoding from the xml declaration? http://www.coderanch.com/t/127052/XML/XML-parsers-encoding-byte-order
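To illustrate the point about the declared encoding, here is a minimal sketch. It assumes the document carries an XML declaration naming ISO-8859-1 and that the bytes really are in that encoding; the class and method names (DeclaredEncodingDemo, rootText) are made up for this example. When given a raw InputStream, the parser reads the encoding from the declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeclaredEncodingDemo {

    // Parses a byte stream and returns the text content of the root element.
    // Because the parser receives raw bytes, it determines the encoding
    // itself, using the XML declaration when one is present.
    public static String rootText(InputStream in) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(in);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; \u00fc is the character 'ü'.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<msg>Hey there this is a \u00fc character.</msg>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(new ByteArrayInputStream(bytes)));
    }
}
```

If the declaration is missing and the bytes are not valid UTF-8, the same call would be expected to fail or produce wrong characters, which is why checking the declaration is a sensible first step.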

edited May 23, 2017 at 11:56


answered Sep 22, 2012 at 9:31

baranowb


3 Comments

Basic Coder Over a year ago

It's an HTML web page, encoded as ISO-8859-1.

baranowb Over a year ago

What is the default charset on the device/machine?

baranowb Over a year ago

Ah, I just noticed the tag. IIRC, if no encoding is specified, the reader/parser assumes the device's encoding (UTF-8 in this case). You need to specify the encoding (docs.oracle.com/javase/1.4.2/docs/api/java/io/…) or create a custom InputStream that peeks at the encoding.
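One way to specify the encoding explicitly is to wrap the stream in an InputStreamReader and hand the parser an InputSource built from that reader. This is only a sketch: the class name ExplicitCharsetParser is hypothetical, and ISO-8859-1 is assumed because that is the charset mentioned in the comments above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class ExplicitCharsetParser {

    // Decodes the bytes with the given charset before the parser sees them.
    // With a character stream, the parser works on the decoded characters
    // and does not try to detect the byte encoding itself.
    public static Element parse(InputStream is, Charset charset) throws Exception {
        Reader reader = new InputStreamReader(is, charset);
        InputSource source = new InputSource(reader);
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(source).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input without an XML declaration, encoded as ISO-8859-1.
        byte[] bytes = "<msg>Hey there this is a \u00fc character.</msg>"
                .getBytes(StandardCharsets.ISO_8859_1);
        Element root = parse(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);
        System.out.println(root.getTextContent());
    }
}
```

This only helps when the charset is known in advance (for example from an HTTP Content-Type header); it does nothing for entity problems in the markup itself.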

Community · Accepted Answer · 2017-05-23 11:43:11Z

0

The problem was the difference between XML entities and HTML entities. I request a web page that returns data containing HTML entities. I had to convert the HTML entities to XML entities, and then it worked!

Check this answer for some code
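A rough sketch of such a conversion, not the code from the linked answer: named HTML entities are replaced with numeric character references, which any XML parser understands. The class name HtmlEntityFixer and the small entity table are assumptions made for this example; a real page may need the full HTML entity list.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlEntityFixer {

    // Matches named entities such as &uuml; (numeric references do not match).
    private static final Pattern NAMED_ENTITY =
            Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Deliberately tiny, illustrative table: HTML entity name -> code point.
    private static final Map<String, Integer> HTML_ENTITIES = new HashMap<>();
    static {
        HTML_ENTITIES.put("uuml", 0xFC);
        HTML_ENTITIES.put("Uuml", 0xDC);
        HTML_ENTITIES.put("auml", 0xE4);
        HTML_ENTITIES.put("Auml", 0xC4);
        HTML_ENTITIES.put("ouml", 0xF6);
        HTML_ENTITIES.put("Ouml", 0xD6);
        HTML_ENTITIES.put("szlig", 0xDF);
        HTML_ENTITIES.put("nbsp", 0xA0);
    }

    // Rewrites known HTML entities as numeric references (e.g. &#252;).
    // Entities not in the table, including XML's own &amp; &lt; &gt;
    // &quot; and &apos;, are left untouched.
    public static String toXmlEntities(String html) {
        Matcher m = NAMED_ENTITY.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = HTML_ENTITIES.get(m.group(1));
            String replacement = (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXmlEntities("<msg>Hey there this is a &uuml; character.</msg>"));
    }
}
```

The converted string can then be passed to the parser, for example through a StringReader wrapped in an InputSource.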

edited May 23, 2017 at 11:43


answered Sep 22, 2012 at 10:12

Basic Coder
