Parse a String to a char?

Question

So in Java, how is it possible to parse a String like "\u000A" to a char? I got that String from a file, so I can't write something like this: char c = '\u000A';

Note that if the literal text \u000A appears when you dump a string you read from the file, it means that you did not read the file with the correct character set translation.
– Hot Licks, Commented Nov 9, 2012 at 21:03
@HotLicks there is a chance that the OP really does need to make that translation.
– auselen, Commented Nov 9, 2012 at 21:05
No, I need those char(s); it is the correct character set translation. But I am wondering about this: String theString = "\u0029"; char theChar = theString.charAt(0); System.out.println(theChar); It works, but shouldn't it return just a '\'?
– Paket2001, Commented Nov 9, 2012 at 21:09
There seems to be some confusion about whether the data is a string of length 1 (i.e. a Unicode escape in a Java string literal) or a string of length 6 (i.e. text read from a file). Take time to clarify. (Note: "\u000A".length() is 1, even though it is "6 characters" in the literal form.)
– user166390, Commented Nov 9, 2012 at 21:13
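
To make the length-1 versus length-6 distinction from the comments concrete, here is a minimal sketch. The class and field names are made up for illustration; the second literal uses a doubled backslash to stand in for text as it would arrive from a file.

```java
public class EscapeLengthDemo {
    // One char: the compiler translates the Unicode escape to ')' before the literal is built.
    static final String COMPILED = "\u0029";

    // Six chars: backslash, 'u', '0', '0', '2', '9', i.e. what a file containing that text yields.
    static final String FROM_FILE = "\\u0029";

    public static void main(String[] args) {
        System.out.println(COMPILED.length());   // 1
        System.out.println(FROM_FILE.length());  // 6
        System.out.println(COMPILED.charAt(0));  // )
        System.out.println(FROM_FILE.charAt(0)); // \
    }
}
```

This is why charAt(0) in the snippet from the comments prints ')' rather than '\': the string never contained a backslash in the first place.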

auselen · Accepted Answer · 2012-11-09 21:12:48Z

3

Check StringEscapeUtils from Apache Commons Lang:

Escapes and unescapes Strings for Java, Java Script, HTML and XML.

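This should work for what you want:

char c = StringEscapeUtils.unescapeJava("\\u000A").charAt(0);

The doubled backslash makes the Java literal contain the six characters \u000A, which is what the string looks like when it has been read from a file.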

edited Nov 9, 2012 at 21:12

answered Nov 9, 2012 at 21:04

auselen


Óscar López · Accepted Answer · 2012-11-09 21:01:12Z

1

Yes you can, this is perfectly valid code:

char c = '\uD840';

The example in your code, '\u000A' happens to be a non-valid Unicode character (probably a decoding problem when reading?). But all valid Unicode characters can be passed along between single quotes.

answered Nov 9, 2012 at 21:01

Óscar López


2 Comments

Hot Licks Over a year ago

(Keep in mind that 0x0A is newline, and hence will produce a new line if you attempt to print it. But it's a perfectly valid character.)

Hot Licks Over a year ago

It won't compile because it looks like a newline in the middle of a string constant. This is BECAUSE it's a valid Unicode character.

Marko Topolnik · Accepted Answer · 2012-11-09 21:28:53Z

0

Without extra libraries, you can use the fact that this is just the hexadecimal value of the char. This expression's value is that character:

(char) Integer.parseInt(input.substring(2), 16)

The technique works even for surrogate pairs because then you'd have two separate \u notations for the pair.
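
A minimal sketch of this approach, assuming the input is exactly the six-character text (backslash, u, four hex digits) or a run of such escapes back to back. The class and method names (UnicodeEscapeParser, parseOne, parseAll) are hypothetical, and the validation is deliberately light.

```java
public class UnicodeEscapeParser {
    // Converts one six-character escape (backslash, 'u', four hex digits) to a char.
    static char parseOne(String escape) {
        if (escape.length() != 6 || !escape.startsWith("\\u")) {
            throw new IllegalArgumentException("Expected backslash-u plus four hex digits: " + escape);
        }
        // Skip the two-character prefix and read the rest as base 16.
        return (char) Integer.parseInt(escape.substring(2), 16);
    }

    // Converts consecutive escapes (for example a surrogate pair) to a String.
    static String parseAll(String escapes) {
        if (escapes.length() % 6 != 0) {
            throw new IllegalArgumentException("Length must be a multiple of 6: " + escapes);
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < escapes.length(); i += 6) {
            out.append(parseOne(escapes.substring(i, i + 6)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char newline = parseOne("\\u000A");
        System.out.println((int) newline); // 10

        // Two escapes that together form one supplementary code point (U+1F600).
        String pair = parseAll("\\uD83D\\uDE00");
        System.out.println(pair.length());                         // 2
        System.out.println(pair.codePointCount(0, pair.length())); // 1
    }
}
```

Each escape still maps to exactly one char, so a surrogate pair simply comes out as two chars that form one code point once they sit next to each other in the String.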

answered Nov 9, 2012 at 21:28

Marko Topolnik


Hot Licks · Accepted Answer · 2012-11-10 00:51:10Z

0

In answer to Oscar Lopez, this compiles and executes just fine:

public class TestUnicode {
    static public void main(String[] argv) {
        System.out.println("This is one line"); \u000A System.out.println("This is another line");
    }
}

The important thing to understand is that the Java compiler translates \uXXXX escapes while the program is being scanned, not when characters are inserted into a string literal (which is how the other \ escapes are handled). Replace the \u000A above with \n and the program will not compile; instead the compiler will report "Illegal character: \92" (92 is the decimal value of \).
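
Another consequence of this early translation, sketched below with a made-up class name: a \u000A inside a line comment ends the comment, so whatever follows it on the same source line is compiled as ordinary code.

```java
public class EarlyTranslationDemo {
    static int counter = 0;

    static int run() {
        counter = 0;
        // The escape at the end of this comment acts as a real line break: \u000A counter++;
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run()); // 1, because the increment after the escape is live code
    }
}
```

The same mechanism explains why a \u000A inside a string or char literal is rejected: by the time the compiler reads the literal, it already contains a real line break.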

answered Nov 10, 2012 at 0:51

Hot Licks
