Saving a Java file in UTF-8

Question

When I run this program it gives me a '?' for the Unicode code point \u0508. This is because the default Windows character encoding, CP-1252, is unable to map this code point.

But when I save this file in Eclipse with 'Text file encoding' = UTF-8 and run the program, it gives me the correct output AԈC.

Why does this work? The Java file is saved as UTF-8, but the underlying Windows OS encoding is still CP-1252. My question is similar to what happens when I try to read, as UTF-16, a text file that was originally written in UTF-8: the output is weird, full of box symbols.

public class e {
    public static void main(String[] args) {
        System.out.println(System.getProperty("file.encoding"));
        String original = new String("A" + "\u0508" + "C");
        try {
            System.out.println("original = " + original);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

How do you run the application? In the Eclipse console, through the Windows command prompt (CMD), or something else?
– Martijn Courteaux, Commented Dec 29, 2012 at 0:08

Martijn Courteaux · Accepted Answer · 2012-12-29 00:18:32Z

3

Saving the Java source file as either UTF-8 or Windows-1252 shouldn't make any difference, because both encodings encode all the ASCII code points the same way, and your source file uses only ASCII characters (the \u0508 escape is itself written in ASCII).

So you should look for the cause somewhere else. I suggest carefully repeating the steps you took and running the tests again.
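
To illustrate the point about ASCII, here is a minimal sketch (the class and method names are made up for this example). It encodes the source line from the question in both charsets and compares the bytes, assuming the JDK provides the windows-1252 charset, which standard Windows JDKs do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SourceEncodingCheck {
    // The line from the question as it is typed in the editor: the escape is
    // a backslash, the letter u and four hex digits, all ASCII characters.
    static final String SOURCE_LINE =
            "String original = new String(\"A\" + \"\\u0508\" + \"C\");";

    // True when the text produces identical bytes in UTF-8 and windows-1252.
    static boolean sameBytesInBothEncodings(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] cp1252 = text.getBytes(Charset.forName("windows-1252"));
        return Arrays.equals(utf8, cp1252);
    }

    public static void main(String[] args) {
        // Source text with the escape: only ASCII, so the bytes match.
        System.out.println(sameBytesInBothEncodings(SOURCE_LINE)); // true
        // The decoded string contains U+0508 itself, so the bytes differ.
        System.out.println(sameBytesInBothEncodings("A\u0508C"));  // false
    }
}
```

The second call differs only because the runtime string holds the real character; the file on disk never does.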


bmargulies · Accepted Answer · 2012-12-29 00:43:32Z

2

The issue is the value of file.encoding when you run the program, and the destination of System.out. If System.out is an Eclipse console, that console may well be set to UTF-8. If it's just a Windows DOS box, it uses the CP1252 code page, which cannot represent this character, and will only display ? in this case.
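
A rough way to see this without a console is to capture the bytes a PrintStream emits under each encoding. This is only a sketch: the class and helper names are hypothetical, and it assumes the windows-1252 charset is available in the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ConsoleEncodingDemo {
    // Returns the bytes a PrintStream using the given charset would send
    // to its destination when printing the text.
    static byte[] printedBytes(String text, String charsetName) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
            return sink.toByteArray();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "A\u0508C";
        // CP1252 has no mapping for U+0508, so the encoder substitutes '?' (0x3F).
        System.out.println(hex(printedBytes(original, "windows-1252"))); // 41 3F 43
        // UTF-8 encodes U+0508 as the two bytes D4 88.
        System.out.println(hex(printedBytes(original, "UTF-8")));        // 41 D4 88 43
    }
}
```

The '?' is therefore produced when the string is encoded for output, not when the source file is compiled.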


2 Comments

user547453 Over a year ago

You are right. When I run this program from the command prompt, it gives me the same '?' symbol. Thanks for clarifying my question.

user547453 Over a year ago

So, if I ever want to save this character to a file, how can I do it? I tried using System.out.println("roundTrip = " + new String(original.getBytes("UTF8"),"UTF8")); and it still gives me '?'.
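
One possible approach, offered as a sketch rather than a reply from the thread: the in-memory round trip in the comment above returns a string equal to the original, so the '?' still comes from printing to the console. Writing the bytes to a file with an explicit UTF-8 charset avoids the console entirely. The class name, method name and temporary file below are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SaveAsUtf8 {
    // Writes the text to a temporary file as UTF-8, reads it back as UTF-8,
    // and deletes the file afterwards.
    static String writeAndReadBack(String text) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "A\u0508C";
        String roundTrip = writeAndReadBack(original);
        // Compare in memory instead of printing the character itself, since
        // the console may still be unable to display it.
        System.out.println(roundTrip.equals(original)); // true
    }
}
```

To keep the file, replace the temporary path with a real one and open the result in an editor that is set to UTF-8.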
