2

I'm trying to create a char out of the utf code. I'm reading this code from a file which is a map of characters. All characters are specified by their UTF code.

0020 SPACE
0021 EXCLAMATION MARK
0022 QUOTATION MARK
.
.
.

After reading the code from the file, I end up with this code in a String. How can I convert this code (String) to a char?

1
  • I don't understand the question. You have "0020" and would like the corresponding char, is that it? Commented Jan 22, 2010 at 22:17

3 Answers

4

The codes are stored in hexadecimal so I think you want this:

String code = "0021";
char c = (char)Integer.parseInt(code, 16);
System.out.println("Code: " + code + " Character: " + c);

I assume that none of your character codes exceed the maximum value that can be stored in a char, i.e. the characters in the Basic Multilingual Plane. Because your data format appears to be zero padded up to a maximum length of 4 hexadecimal digits, I assume that all the characters you need to consider are in fact in the BMP.

If this is not the case, you will need a different solution. See Character.toChars(int).
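A minimal sketch of the Character.toChars(int) route, assuming the file could also contain codes above FFFF (for example 1F600). The class and method names (CodePointDemo, fromHex) are made up for illustration and are not part of the question:

```java
class CodePointDemo {
    // Hypothetical helper: turns a hexadecimal code point such as "0021" into a String.
    static String fromHex(String code) {
        int codePoint = Integer.parseInt(code, 16);
        // Character.toChars returns one char for a BMP code point and
        // a surrogate pair (two chars) for a supplementary code point.
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(fromHex("0021"));           // prints !
        System.out.println(fromHex("1F600").length()); // prints 2
    }
}
```

The result is a String rather than a char because a supplementary character does not fit in a single char.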


7 Comments

I already tried that. However once you get to a code like 000A Integer.parseInt fails.
Did you remember the 16? It means to treat the number as hexadecimal. If you omit this parameter it will not work.
My bad, I did not realize the codes were hexadecimal.
@Mark Byers - casting to char like that will only work for characters in the basic multilingual plane - see Character.toChars(int).
@Mark Byers - you're probably right, but it's hard to tell from the sample data. It is common to zero-pad the BMP code points: unicode.org/Public/UNIDATA/UnicodeData.txt
1

Parse it into an integer using Integer.parseInt(String, 16), then cast it to a char.
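As a rough sketch, this could be applied to one line of the map file like so. It assumes the code is the first whitespace-separated field and that every code is in the BMP; the class name CharMapLine, the helper charOf, and the "000A LINE FEED" sample line are made up for illustration:

```java
class CharMapLine {
    // Hypothetical helper: extracts the char from a line such as "0021 EXCLAMATION MARK".
    static char charOf(String line) {
        // Keep only the first field, i.e. the hexadecimal code.
        String code = line.trim().split("\\s+", 2)[0];
        // Radix 16 is required; without it a code like "000A" is rejected.
        return (char) Integer.parseInt(code, 16);
    }

    public static void main(String[] args) {
        System.out.println(charOf("0021 EXCLAMATION MARK")); // prints !
        System.out.println((int) charOf("000A LINE FEED"));  // prints 10
    }
}
```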

Comments

0

It looks like UTF-16. To create a String from these bytes, use:

new String(new byte[]{0x00, 0x21}, "UTF-16")

This creates a String which holds the exclamation mark. The character is charAt(0).

EDIT

This might not be the most performant approach, but it works for other encodings as well.

EDIT

OK, there was a misunderstanding: the code above was not a solution but an example of how to use the String constructor to create a String from a series of bytes in a particular encoding. Because it was an example, it looked static. Here's the runtime solution (knowing that the accepted solution in particular fits much better; this one is just more general):

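import java.io.UnsupportedEncodingException;

// Decodes two bytes as one UTF-16 code unit (big-endian when there is no byte order mark).
public char decodeUTF16(byte b1, byte b2) throws UnsupportedEncodingException {
  return decodeUTF16(new byte[]{b1, b2}).charAt(0);
}

public String decodeUTF16(byte[] bytes) throws UnsupportedEncodingException {
  return decode(bytes, "UTF-16");
}

// The String constructor throws UnsupportedEncodingException for an unknown encoding name.
public String decode(byte[] bytes, String encoding) throws UnsupportedEncodingException {
  return new String(bytes, encoding);
}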

5 Comments

@Andreas_D: Downvote, because the OP wants a runtime solution and yours is compile time, and for mentioning and abusing UTF-16. The OP already has the Unicode code point in hex; once it is decoded to an int via Integer.parseInt(), they don't need UTF-16 decoding. char c = (char) 0x0020, on the other hand, would be a valuable contribution; I would suggest editing that in.
It was an example. Just an example.
You are still abusing UTF-16. There is no UTF-16 in sight in the question!
So what's the given encoding? It could be UCS-2 as well. The point is, before you convert bytes into chars you have to think about the encoding. UTF-16 was a guess; at least he says 'UTF code', and it is not UTF-8 (although I'm pretty sure that he shows the Unicode values).
That is the point! There is no encoding! There are only integer values, the so-called Unicode code points. Hexadecimal is the encoding in this question, if you want. The point of every Unicode encoding is to encode Unicode code points (integer values) into a sequence of bytes, but it could just as well be bird chirps or smoke signals as far as Unicode cares; the only meaningful aspect is being able to recover the original sequence of integers.
