“UTF-8” encoding is not working in java build [closed]

Question

Closed. This question needs debugging details. It is not currently accepting answers.

Closed 8 years ago.

I saved my Java source file with its encoding set to UTF-8 in Eclipse. It works fine in Eclipse. When I create a build with Maven and execute it on my system, Unicode characters are not displayed correctly.

This is my code:

    byte[] bytes = new byte[dataLength];
    buffer.readBytes(bytes);
    String s = new String(bytes, Charset.forName("UTF-8"));
    System.out.println(s);

Screenshots of the Eclipse console and the Windows console are attached. I expect the Eclipse output on other systems as well (Windows Command Prompt, a PowerShell window, a Linux machine, etc.).

What is the value of system property file.encoding when running in the console? How do you read the data, how do you print? Show some code.
– Mark Rotteveel, Commented Sep 22, 2017 at 9:55
Probably your PowerShell encoding is not UTF-8. Try to set its encoding as UTF-8: run command [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8 and then run your java program.
– Mykhailo Hodovaniuk, Commented Sep 22, 2017 at 9:58
It is the maven-compiler-plugin that has to know the encoding to compile with too. This is a pom setting. Errors in the console cannot be trusted to be real errors, as there typically might be another platform encoding set.
– Joop Eggen, Commented Sep 22, 2017 at 9:59
@MarkRotteveel getting data from server and printing it in console. I have updated question with my sample code.
– Prasath, Commented Sep 22, 2017 at 10:00
@Prasath all you've done in the Eclipse settings is set the source encoding to UTF-8. That will make no difference whatsoever to your program, unless you have non-ASCII characters in your source code, e.g. if you have a £ sign in a variable name. You haven't changed the system default encoding.
– Klitos Kyriacou, Commented Sep 22, 2017 at 10:07
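
The comments above separate three things: the source-file encoding, the charset used to decode the bytes, and the encoding used when printing. As a minimal sketch (the class name `EncodingCheck` and the helper `decodeUtf8` are made up for illustration, not taken from the question), the following prints the JVM defaults that Mark Rotteveel asks about and shows that decoding with an explicit charset does not depend on them:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Hypothetical helper: decodes bytes as UTF-8 regardless of the platform default.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // These values may differ between an Eclipse launch and a plain console launch.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // UTF-8 bytes for U+00F6 (o with diaeresis): 0xC3 0xB6.
        byte[] bytes = {(byte) 0xC3, (byte) 0xB6};
        String s = decodeUtf8(bytes);

        // The decoded string is the same on every platform ...
        System.out.println(String.format("decoded code point = U+%04X", s.codePointAt(0)));
        // ... but how this line is rendered depends on the encoding System.out uses.
        System.out.println(s);
    }
}
```

If the code point line shows U+00F6 but the last line looks wrong, the decoding is fine and the problem is on the output side.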

Ortwin Angermeier · Accepted Answer · 2017-09-22 10:58:34Z

0

You could use the Console class for that. The following code could give you some inspiration:

import java.io.Console;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

public class Foo {

    public static void main(String[] args) throws IOException {
        String s = "öäü";
        write(s);
    }

    private static void write(String s) throws IOException {
        // An OutputStreamWriter without an explicit charset uses the JVM default charset.
        String encoding = new OutputStreamWriter(System.out).getEncoding();
        Console console = System.console();
        if (console != null) {
            // if there is a console attached to the jvm, use it.
            System.out.println("Using encoding " + encoding + " (Console)");
            try (PrintWriter writer = console.writer()) {
                writer.write(s);
                writer.flush();
            }
        } else {
            // fall back to "normal" system out
            System.out.println("Using encoding " + encoding + " (System out)");
            System.out.print(s);
        }
    }
}

Tested on Windows 10 (PowerShell) and Ubuntu 16.04 (bash) with default settings. Also works from within IntelliJ (Windows and Linux).

answered Sep 22, 2017 at 10:58

4 Comments

Prasath Over a year ago

I tried your code. Still not working.

Ortwin Angermeier Over a year ago

Mhh strange, it works on my side, just double checked. Can you post a running sample where it is not working?

Prasath Over a year ago

This question is closed. The issue is in the PowerShell window execution. I found the solution.

Larsen Over a year ago

@Prasath Then why don't you post your solution?

matt · Accepted Answer · 2017-09-25 14:45:13Z

From what I can tell, you either have the wrong character, which I don't think is the case, or you are trying to display it on a terminal that doesn't handle the character. I have written a short test to separate the issues.

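import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class UnicodeTest {

    public static void main(String[] args) {
        // The same four characters, once as literals and once as Unicode escapes.
        String testA = "ֆޘᜅᾮ";
        String testB = "\u0586\u0798\u1705\u1FAE";

        // true only if the literals in the source were read as the intended characters.
        System.out.println(testA.equals(testB));
        System.out.println(testA);
        System.out.println(testB);

        // Write both strings as UTF-8 to a file, bypassing the console entirely.
        try (BufferedWriter check = Files.newBufferedWriter(
                Paths.get("uni-test.txt"),
                StandardCharsets.UTF_8,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            check.write(testA);
            check.write("\n");
            check.write(testB);
            // No explicit close(): try-with-resources closes the writer.
        } catch (IOException ioc) {
            // Report the failure instead of silently swallowing it.
            ioc.printStackTrace();
        }
    }
}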

You could replace the values with the characters you want.

The first line should print true if the string is the actual string you want. After that, it is a matter of displaying the characters. For example, if I open the text file with less, half of the characters are broken. If I open it with Firefox, I see all four characters, but some are wonky. You'll need a font that has glyphs for the corresponding Unicode values.
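
To check the string contents without depending on any font or terminal, one option is to print the code points as plain ASCII. This is only a sketch; the class name `CodePointDump` and the method `describe` are hypothetical names chosen for illustration:

```java
public class CodePointDump {

    // Formats each code point of s as U+XXXX, separated by single spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // The output is ASCII only, so it survives any console encoding.
        // prints "U+0586 U+0798 U+1705 U+1FAE"
        System.out.println(describe("\u0586\u0798\u1705\u1FAE"));
    }
}
```

If the dump shows the expected values, the data is correct and only the display (console encoding or font) is at fault.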

One thing you can do is open the file in a word processor and select a font that displays the characters you want correctly.

As suggested by the OP, including -Dfile.encoding=UTF8 causes the characters to display correctly when using System.out.println. This is similar to this question, which changes the encoding of System.out.
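
Changing the encoding of System.out from inside the program could look roughly like the sketch below. The class name `Utf8Out` and the helper `utf8Stream` are assumptions made for illustration; the approach relies only on the standard `PrintStream(OutputStream, boolean, String)` constructor and `System.setOut`:

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Out {

    // Hypothetical helper: wraps a stream in an auto-flushing PrintStream that encodes as UTF-8.
    static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Replace System.out with a stream that always writes UTF-8 bytes to stdout.
        System.setOut(utf8Stream(new FileOutputStream(FileDescriptor.out)));
        System.out.println("\u00F6\u00E4\u00FC");
    }
}
```

This only controls the bytes the JVM writes. The terminal still has to interpret them as UTF-8, which is what the PowerShell setting mentioned in the comments on the question addresses.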
