1

So in Java, how is it possible to parse a String like "\u000A" to a char? I got that String from a file, so I can't write something like this: char c = '\u000A';

6
  • 1
    char theChar = theString.charAt(0); Commented Nov 9, 2012 at 21:01
  • 2
    Note that if the literal text \u000A is appearing when you dump a string you read from the file, it means that you did not read the file with the correct character set translation. Commented Nov 9, 2012 at 21:03
  • @HotLicks there is a chance on earth that OP needs to make that translation. Commented Nov 9, 2012 at 21:05
  • No, I need those char(s); it is the correct character set translation. But I am wondering about this: String theString = "\u0029"; char theChar = theString.charAt(0); System.out.println(theChar); It works, but shouldn't it return just a '\'? Commented Nov 9, 2012 at 21:09
  • 1
    There seems to be some confusion if the data is a string of length 1 (i.e. Unicode escape in Java string literal) or a string of length 6 (i.e. text read from a file) - take time to clarify. (Note: "\u000A".length() is 1, even though it is "6 characters" in the literal form.) Commented Nov 9, 2012 at 21:13
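To make the distinction in the last comment concrete, here is a minimal sketch. It uses U+0029 from the OP's comment rather than U+000A, because a Unicode escape for a line terminator written inside a string literal is turned into a real line break before the literal is parsed. The class and method names are made up for illustration.

```java
final class EscapeLengthDemo {
    // Unicode escape in a string literal: the compiler sees a single ')' character.
    static int literalLength() {
        return "\u0029".length();
    }

    // Escaped backslash: the string holds six separate characters,
    // which is what you get when the text is read from a file.
    static int fileTextLength() {
        return "\\u0029".length();
    }

    public static void main(String[] args) {
        System.out.println(literalLength());
        System.out.println(fileTextLength());
    }
}
```

This is also why charAt(0) on the literal form gives ')' and not a backslash: the backslash never makes it into the string.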

4 Answers

3

Check StringEscapeUtils

Escapes and unescapes Strings for Java, Java Script, HTML and XML.

This should work for what you want

char c = StringEscapeUtils.unescapeJava("\\u000A").charAt(0);

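The double backslash makes the Java string literal contain the six characters \u000A (backslash, u, 0, 0, 0, A), which is what you would get when reading that text from a file.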


1

Yes you can, this is perfectly valid code:

char c = '\uD840';

The example in your code, '\u000A' happens to be a non-valid Unicode character (probably a decoding problem when reading?). But all valid Unicode characters can be passed along between single quotes.

2 Comments

(Keep in mind that 0x0A is newline, and hence will produce a new line if you attempt to print it. But it's a perfectly valid character.)
It won't compile because it looks like a newline in the middle of a string constant. This is BECAUSE it's a valid Unicode character.
0

Without extra libraries, you can use the fact that this is just the hexadecimal value of the char. This expression's value is that character:

(char) Integer.parseInt(input.substring(2), 16)

The technique works even for surrogate pairs because then you'd have two separate \u notations for the pair.
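As a rough sketch of how this could be applied to text that contains several escapes, the class below finds each backslash-u sequence with a regular expression and converts its four hex digits with Integer.parseInt. The class name, method name, and regex are my own choices for illustration, not something from the answer, and the sketch assumes every escape has exactly four hex digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeEscapeDecoder {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    // Replaces every escape in the text with the char it denotes.
    // Anything that does not match the pattern is copied through unchanged.
    static String decode(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            // The hex digits are the UTF-16 code unit of the char.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        char c = decode("\\u000A").charAt(0);
        System.out.println((int) c);
    }
}
```

A surrogate pair arrives as two escapes in a row, so the two decoded chars end up adjacent in the result and together form the supplementary code point.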


0

In answer to Oscar Lopez, this compiles and executes just fine:

public class TestUnicode {
    static public void main(String[] argv) {
        System.out.println("This is one line"); \u000A System.out.println("This is another line");
    }
}

The important thing to understand is that, in the Java compiler, \uXXXX escapes are translated as the program is being scanned, not as the characters are being inserted into a string literal (which is the norm for other \ escapes). Replace the \u000A above with \n and the program will not compile; instead the compiler will report "Illegal character: \92" (92 is the decimal value for \).
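Another small sketch of the same point, with a made-up class name: because the translation happens before the source is split into tokens, an escape can even stand in for an operator.

```java
final class EarlyTranslationDemo {
    static int sum() {
        // The escape below is U+002B, the plus sign.
        // After translation the statement reads: return 1 + 2;
        return 1 \u002B 2;
    }

    public static void main(String[] args) {
        System.out.println(sum());
    }
}
```

The same early translation is why a line-terminator escape inside a string or char literal breaks compilation: by the time the literal is parsed, it already contains a real line break.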

