I have already seen other questions about the same problem, but I still get an error. Here is the small piece of code where I try to modify existing XML files. However, it changes some characters in the text.

import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;
import java.io.FileOutputStream;
import java.io.IOException;

public class ModyfyXml {

    public static void main(String[] args) {

        try {
            // Parse the existing file into a JDOM document.
            SAXBuilder sax = new SAXBuilder();
            Document doc = sax.build("F:\\c\\test.xml");

            XMLOutputter xmlOutput = new XMLOutputter();
            Format format = Format.getPrettyFormat();
            format.setEncoding("UTF-8");
            xmlOutput.setFormat(format);

            // Write the document back out; try-with-resources closes the stream.
            try (FileOutputStream out = new FileOutputStream("F:\\c\\test2.xml")) {
                xmlOutput.output(doc, out);
            }

        } catch (IOException io) {
            io.printStackTrace();
        } catch (JDOMException e) {
            e.printStackTrace();
        }
    }
}

Here is a small XML file that I try to modify (in this case, just copy):

<?xml version="1.0" encoding="utf-8"?><page>
 䕶法喇嘛所居此處𡸁仲無妻室亦降神附體
</page> 

After running the program I get the following:

<?xml version="1.0" encoding="UTF-8"?>
<page>䕶法喇嘛所居此處&#x21e01;仲無妻室亦降神附體</page>

Some Chinese characters are not written out correctly: 𡸁 comes out as &#x21e01;.

1 Answer

Dang, I never noticed this bug in JDOM 2.

You will get the same result with any non-BMP character, that is, any character above U+FFFF. Try it with the emoji that have become so popular in recent years and you will see the same thing.
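To see what sets such a character apart, here is a minimal sketch using only the JDK (no JDOM). The class and method names are made up for illustration, and it assumes the character from the question is U+21E01, as the output in the question suggests. It shows that Java stores the character as two `char` values (a surrogate pair), which is what any check that looks at one `char` at a time will see.

```java
class SurrogateDemo {

    // Build the character from its code point so the example does not
    // depend on the encoding of the source file.
    static String sample() {
        return new String(Character.toChars(0x21E01));
    }

    // Format a code point as a hexadecimal XML character reference.
    static String charRef(int codePoint) {
        return "&#x" + Integer.toHexString(codePoint) + ";";
    }

    public static void main(String[] args) {
        String s = sample();
        System.out.println(s.length());                             // 2 (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length()));        // 1 (code point)
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));  // true
        System.out.println(charRef(s.codePointAt(0)));              // &#x21e01;
    }
}
```

Swapping `0x21E01` for an emoji code point such as `0x1F600` gives the same picture: two `char` values, one code point.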

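It happens because of the escape strategy that is set automatically for the UTF encodings. What it does is rather wrong: it has these characters written out as numeric character references instead of as the characters themselves.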

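You can fix this by replacing the strategy with one that does not escape anything besides the XML reserved characters. Make the call after format.setEncoding("UTF-8"), since setting the encoding is what installs the default strategy: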

format.setEscapeStrategy((c) -> false);
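If you want to compare the output before and after the change, the following sketch may help. It uses the JDK's built-in DOM parser rather than JDOM, the class and method names are hypothetical, and the two XML strings are trimmed-down stand-ins for the files in the question. It parses the escaped form and the literal form and checks that both yield the same text, so the difference is in how the file looks on disk, not in what a parser reads back.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class EscapeEquivalenceCheck {

    // Parse an XML string with the JDK DOM parser and return the text
    // content of its root element.
    static String rootText(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement()
                    .getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // The character from the question, built from its assumed code point.
        String raw = new String(Character.toChars(0x21E01));

        String literal = "<page>" + raw + "</page>";   // written as the character itself
        String escaped = "<page>&#x21e01;</page>";     // written as a character reference

        System.out.println(rootText(literal).equals(rootText(escaped))); // true
    }
}
```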