585

I'm trying to use a constant instead of a string literal in this piece of code:

new InputStreamReader(new FileInputStream(file), "UTF-8")

"UTF-8" appears in the code rather often, and it would be much better to refer to some static final variable instead. Do you know where I can find such a variable in the JDK?

3
  • 11
    See this question. Commented Jul 14, 2011 at 18:51
  • 2
    Note: if you are already on Java 7, use Files.newBufferedWriter(Path path, Charset cs) from NIO. Commented Aug 8, 2018 at 14:55
  • 3
    That's some really bad advice from your link. He wants you to make a wrapper class for every possible string constant you might use? Commented Jun 25, 2020 at 20:15

11 Answers 11

1003

In Java 1.7+, the class java.nio.charset.StandardCharsets defines constants for Charset including UTF_8.

import java.nio.charset.StandardCharsets;

...

StandardCharsets.UTF_8.name();

For Android: minSdk 19
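
A minimal sketch of the two usages discussed in this thread: passing the Charset constant straight to InputStreamReader, and calling name() only when an API insists on a String. The class name CharsetConstantDemo and the helper firstLine are made up for illustration, and UncheckedIOException assumes Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CharsetConstantDemo {

    // Hypothetical helper: reads the first line of a stream, decoding it as UTF-8.
    static String firstLine(InputStream in) {
        // The Charset overload of the constructor does not declare
        // UnsupportedEncodingException, unlike the String-name overload.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "first\nsecond".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstLine(new ByteArrayInputStream(bytes)));

        // name() returns the canonical name for APIs that only accept a String.
        System.out.println(StandardCharsets.UTF_8.name()); // prints UTF-8
    }
}
```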


8 Comments

do you use .toString() on that?
.toString() will work, but the proper function is .name(). 99.9% of the time, toString() is not the answer.
btw .displayName() will also work unless it is overridden for localization as intended.
You don't really need to call name() at all. You can directly pass the Charset object into the InputStreamReader constructor.
And there are other libs out there which do require a String, perhaps because of legacy reasons. In such cases, I keep a Charset object around, typically derived from StandardCharsets, and use name() if needed.
149

Now I use org.apache.commons.lang3.CharEncoding.UTF_8 constant from commons-lang.

4 Comments

For those using Lang 3.0: org.apache.commons.lang3.CharEncoding.UTF_8. (Note "lang3").
If you're using Java 1.7, see @Roger's answer below since it's part of the standard library.
P.S. "@Roger's answer below" is now @Roger's answer above. ☝
That class is deprecated, since Java 7 introduced java.nio.charset.StandardCharsets.
73

The Google Guava library (which I'd highly recommend anyway, if you're doing work in Java) has a Charsets class with static fields like Charsets.UTF_8, Charsets.UTF_16, etc.

Since Java 7 you should just use java.nio.charset.StandardCharsets instead for comparable constants.

Note that these constants aren't strings; they're actual Charset instances. All standard APIs that take a charset name also have an overload that takes a Charset object, which you should use instead.
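
As a rough illustration of those Charset overloads, here is a small sketch using String.getBytes(Charset) and the String(byte[], Charset) constructor from the JDK. The class name CharsetOverloads and the helper roundTrip are hypothetical and only there to make the example self-contained.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetOverloads {

    // Hypothetical helper: encodes text to bytes and decodes it back
    // using the Charset overloads rather than the String-name ones.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);   // String.getBytes(Charset)
        return new String(encoded, charset);       // String(byte[], Charset)
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(original.equals(roundTrip(original, StandardCharsets.UTF_8))); // prints true

        // UTF-8 encodes U+00E9 as two bytes, so the four characters take five bytes.
        System.out.println(original.getBytes(StandardCharsets.UTF_8).length); // prints 5
    }
}
```

Neither overload declares a checked exception, which is one practical reason to prefer them over the variants that take a charset name.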

8 Comments

So, should be Charsets.UTF_8.name()?
@kilaka Yeah, use name() instead of displayName(), since name() is final and displayName() is not.
@Buffalo: Please read my answer again: it recommends using java.nio.charset.StandardCharsets when possible, which is not third party code. Additionally, the Guava Charsets definitions are not "constantly modified" and AFAIK have never broken backwards compatibility, so I don't think your criticism is warranted.
@Buffalo: That's as it may be, but I doubt your issues had anything to do with the Charsets class. If you want to complain about Guava, that's fine, but this is not the place for those complaints.
Please do not include a multi-megabyte library to get one string constant.
51

In case this page comes up in someone's web search: as of Java 1.7 you can use java.nio.charset.StandardCharsets to get access to constant definitions of the standard charsets.

4 Comments

I have been trying to use this, but it does not seem to work. 'Charset.defaultCharset()' seems to work after importing 'java.nio.charset.*', but I can't seem to explicitly refer to UTF-8 when I am trying to use 'Files.readAllLines'.
@Roger What seems to be the problem? From what I can see you can just call: Files.readAllLines(Paths.get("path-to-some-file"), StandardCharsets.UTF_8);
I don't know what the problem was, but it worked for me after changing something which I can't remember.
^^^ You probably had to change the target platform in the IDE. If 1.6 was your latest JDK when you installed the IDE, it probably picked it as the default & kept it as the default long after you'd updated both the IDE and JDK themselves in-place.
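
For anyone hitting the same issue, here is a hedged sketch of the Files.readAllLines call from the comment above, run against a temporary file so it is self-contained. The class name ReadAllLinesDemo, the helper writeThenRead, and the temp-file prefix are made up for the example; it needs a Java 7+ target (Java 8+ for UncheckedIOException).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class ReadAllLinesDemo {

    // Hypothetical helper: writes the lines to a temp file as UTF-8,
    // reads them back with the same charset, then deletes the file.
    static List<String> writeThenRead(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("charset-demo", ".txt");
            try {
                Files.write(tmp, lines, StandardCharsets.UTF_8);
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("alpha", "b\u00eata");
        System.out.println(lines.equals(writeThenRead(lines))); // prints true
    }
}
```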
10

This constant is available (along with others such as UTF-16 and US-ASCII) in the class org.apache.commons.codec.CharEncoding as well.


10

In Java 1.7+

Do not use the "UTF-8" string; instead, use the Charset-typed parameter:

import java.nio.charset.StandardCharsets;

...

new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8);


9

There are none (at least in the standard Java library). Character sets vary from platform to platform so there isn't a standard list of them in Java.

There are some 3rd party libraries which contain these constants though. One of these is Guava (Google core libraries): http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/base/Charsets.html

3 Comments

It took me a second to catch on to this... Guava's Charsets constants are (no surprise) Charsets, not Strings. InputStreamReader has another constructor that takes a Charset rather than a string. If you really need the string, it's e.g. Charsets.UTF_8.name().
Character sets may vary from platform to platform, but UTF-8 is guaranteed to exist.
All charsets defined in StandardCharsets are guaranteed to exist in every Java implementation on every platform.
8

You can use the Charset.defaultCharset() API or the file.encoding system property.

But if you want your own constant, you'll need to define it yourself.

1 Comment

The default charset is usually determined by the OS and locale settings; I don't think there is any guarantee that it remains the same across multiple Java invocations. So this is no replacement for a constant specifying "utf-8".
5

If you are using OkHttp for Java/Android you can use the following constant:

import com.squareup.okhttp.internal.Util;

Util.UTF_8; // Charset
Util.UTF_8.name(); // String

1 Comment

It has been removed from OkHttp, so the alternative is Charset.forName("UTF-8").name() when you need to support Android below API 19; otherwise you can use StandardCharsets.UTF_8.name().
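
If StandardCharsets is not available on your target, one option is to define the constant once yourself with Charset.forName. This is only a sketch: the class name LegacyCharsets and its field names are hypothetical, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical holder class for targets without java.nio.charset.StandardCharsets.
public final class LegacyCharsets {

    // UTF-8 must be supported by every Java platform implementation,
    // so this lookup is not expected to fail at class initialization.
    public static final Charset UTF_8 = Charset.forName("UTF-8");

    // Canonical name, for APIs that only accept a String.
    public static final String UTF_8_NAME = UTF_8.name();

    private LegacyCharsets() {
        // constants only, no instances
    }

    public static void main(String[] args) {
        System.out.println(UTF_8_NAME); // prints UTF-8

        // Where StandardCharsets exists, the two constants compare equal.
        System.out.println(UTF_8.equals(StandardCharsets.UTF_8)); // prints true
    }
}
```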
5

Constant definitions for the standard charsets. These charsets are guaranteed to be available on every implementation of the Java platform. Since 1.7.

 import java.nio.charset.Charset;
 import java.nio.charset.StandardCharsets;

 Charset utf8 = StandardCharsets.UTF_8;


4

The class org.apache.commons.lang3.CharEncoding (and with it the UTF_8 constant) is deprecated, since Java 7 introduced java.nio.charset.StandardCharsets:

  • @see JRE character encoding names
  • @since 2.1
  • @deprecated Java 7 introduced {@link java.nio.charset.StandardCharsets}, which defines these constants as {@link Charset} objects. Use {@link Charset#name()} to get the string values provided in this class. This class will be removed in a future release.

