0

The string Frédéric is being converted to FrÃ©dÃ©ric in Java. However, I need to pass the proper string to my client. How can I achieve this in Java?

I tried:

String a = "FrÃ©dÃ©ric";
String b = new String(a.getBytes(), "UTF-8");

However, string b contains the same value as a. I expect the string to hold the value Frédéric. How do I pass this value properly to the client?

3 Comments
  • How do you pass the String to your client? What is the client written in and how is it processing your String? Commented Sep 3, 2014 at 6:36
  • Java strings are internally UTF-16, and any String you create will be in that format. You can get the corresponding UTF-8 byte array with "Frédéric".getBytes(Charset.forName("UTF-8")) if that's what you want. Commented Sep 3, 2014 at 6:38
  • This is a duplicate of the numerous other questions about decoding input. Commented Sep 3, 2014 at 6:52
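
To make the second comment concrete, here is a minimal sketch of getting the UTF-8 bytes of a Java String. The class name Utf8BytesDemo and the helper toUtf8 are made up for illustration, and the literal uses Unicode escapes so that the example does not depend on how the source file itself is saved.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesDemo {
    // Hypothetical helper: returns the UTF-8 encoding of a Java String.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The name from the question, with each e-acute written as an escape.
        String name = "Fr\u00e9d\u00e9ric";
        byte[] utf8 = toUtf8(name);

        System.out.println(name.length()); // 8 (UTF-16 code units)
        System.out.println(utf8.length);   // 10 (each e-acute takes two bytes in UTF-8)
    }
}
```

The String itself has no "UTF-8 form"; only the byte array produced from it does, which is why re-wrapping the bytes in a new String does not change what the String contains.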

4 Answers

2

If I understand the question correctly, you're looking for a function that will repair strings that have been damaged by others' encoding mistakes?

Here's one that seems to work on the example you gave:

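import java.nio.charset.Charset;

// Re-encode the mis-decoded text as cp1252 bytes, then decode those bytes as UTF-8.
static String fix(String badInput) {
    byte[] bytes = badInput.getBytes(Charset.forName("cp1252"));
    return new String(bytes, Charset.forName("UTF-8"));
}

fix("FrÃ©dÃ©ric").equals("Frédéric") // true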
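
For anyone who wants to see where the damage comes from, here is a rough, self-contained sketch that first reproduces the mistake (UTF-8 bytes decoded as cp1252) and then reverses it the same way fix does. The class name MojibakeDemo and the garble helper are hypothetical and exist only for this demonstration. It assumes the damage really was UTF-8 read as cp1252; for other combinations of encodings the repair may not work.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    static final Charset CP1252 = Charset.forName("cp1252");

    // Simulates the mistake: encode as UTF-8, then decode the bytes as cp1252.
    static String garble(String good) {
        return new String(good.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Reverses the mistake: recover the original bytes, then decode them as UTF-8.
    static String fix(String bad) {
        return new String(bad.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this file independent of its own encoding.
        String good = "Fr\u00e9d\u00e9ric";
        String bad = garble(good);

        System.out.println(bad.length());          // 10: each e-acute became two characters
        System.out.println(fix(bad).equals(good)); // true
    }
}
```

A repair like this is a workaround. If you control the code that decodes the bytes in the first place, it is usually better to decode them as UTF-8 there.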

0

The answer is quite complicated. See http://www.joelonsoftware.com/articles/Unicode.html for a basic understanding. My first suggestion would be to save your Java source file as UTF-8. On Windows, Eclipse defaults to cp1252, which might be your problem. Hope this helps.
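
If you are unsure which encoding your environment assumes, a quick check like the following may help. It is only a sketch: the class name DefaultCharsetCheck is made up, and it reports the default of the running JVM, which is not necessarily the encoding your IDE uses when saving files. On recent JDKs the default may already be UTF-8.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Hypothetical helper: summarizes the charset defaults of the running JVM.
    static String describeDefaults() {
        return "default charset: " + Charset.defaultCharset().name()
                + ", file.encoding: " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```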

0

Find your language code here and use that.

String a = new String(yourString.getBytes(), YOUR_ENCODING);

You can also try:

String a = URLEncoder.encode(yourString, HTTP.YOUR_ENCODING);
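
As an illustration of the URLEncoder suggestion, here is a small sketch that assumes UTF-8 is the encoding you want (the answer leaves the encoding as a placeholder). The class name UrlEncodeDemo and the helper encodeUtf8 are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlEncodeDemo {
    // Hypothetical helper: percent-encodes a string using UTF-8.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep this file independent of its own encoding.
        System.out.println(encodeUtf8("Fr\u00e9d\u00e9ric")); // Fr%C3%A9d%C3%A9ric
    }
}
```

This only helps if the client URL-decodes the value with the same encoding. Note also that getBytes() without an argument uses the platform default charset, so the first snippet can behave differently from one machine to another.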

0

If System.out.println("Frédéric") shows garbled output on the console, the most likely cause is that the encoding of your source file (apparently UTF-8) differs from the one the compiler assumes, which by default is the platform encoding, so probably some flavor of ISO-8859. Try compiling with javac -encoding UTF-8 (or set the corresponding property in your build environment) and you should be OK.
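
One way to rule out the source file encoding entirely is to write the non-ASCII characters as Unicode escapes, as in the sketch below. The class name EscapedLiteralDemo is made up, and wrapping System.out in a UTF-8 PrintStream is an assumption about what your console expects; whether the name displays correctly still depends on the terminal.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class EscapedLiteralDemo {
    // The compiler resolves these escapes the same way whatever -encoding is used.
    static final String NAME = "Fr\u00e9d\u00e9ric";

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Assumption: the console interprets the bytes it receives as UTF-8.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(NAME);
    }
}
```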

If you are sending this to some other piece of client software it's most likely an encoding issue on the client-side.
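
On the sending side, the usual precaution is to name the charset explicitly when turning the String into bytes, so the result does not depend on the platform default. The following is a minimal sketch, not a description of your actual setup: SendUtf8Demo, send and sendToBuffer are hypothetical names, and a ByteArrayOutputStream stands in for whatever stream really reaches your client.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class SendUtf8Demo {
    // Hypothetical helper: writes text to any stream as UTF-8.
    static void send(OutputStream out, String text) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(text);
        writer.flush();
    }

    // Stand-in for a real connection: captures the bytes that would be sent.
    static byte[] sendToBuffer(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            send(buffer, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String name = "Fr\u00e9d\u00e9ric";
        byte[] wire = sendToBuffer(name);

        // A client that decodes the same bytes as UTF-8 gets the original text back.
        String received = new String(wire, StandardCharsets.UTF_8);
        System.out.println(received.equals(name)); // true
    }
}
```

If the client decodes those bytes with a different charset, the garbling happens on its side, which is the situation this answer describes.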
