Java Strings know nothing of SGML / XML / HTML5 entities. &eacute; is such an entity. It works in web browsers inside HTML because one of the DTDs, or the HTML5 spec, defines &eacute; as the letter e with acute accent by mapping it to the corresponding Unicode character reference &#233; (U+00E9).
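To illustrate the point, here is a minimal sketch showing that Java treats the entity as eight ordinary characters. The class name EntityDemo and the helper decodeEacute are hypothetical names made up for this example, and the helper only handles this single entity; it is not a general HTML decoder.

```java
public class EntityDemo {
    // Hypothetical helper: replaces only the one named entity discussed here.
    // A real application would use a proper HTML/XML parsing library instead.
    public static String decodeEacute(String s) {
        return s.replace("&eacute;", "\u00e9");
    }

    public static void main(String[] args) {
        String raw = "caf&eacute;";
        // The entity is just the characters & e a c u t e ; to Java.
        System.out.println(raw.length());               // 11
        // After the explicit replacement the string holds a single e acute.
        System.out.println(decodeEacute(raw).length()); // 4
    }
}
```

Nothing in the String class does this replacement for you; something has to decode the entity explicitly.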
new String(someString.getBytes("UTF-8"), "UTF-8"); is a meaningless operation: it converts a String into bytes using an encoding that can represent every character, then converts those bytes back into a String. The result is the same as using someString directly, except that you get a new object.
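A small sketch of that round trip, assuming ordinary well-formed text. The class and method names are made up for the example, and it uses StandardCharsets.UTF_8 rather than the "UTF-8" string only to avoid the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Encodes to UTF-8 bytes and decodes straight back again.
    public static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "&eacute; and \u00e9";
        String copy = roundTrip(original);
        System.out.println(copy.equals(original)); // true: same characters
        System.out.println(copy == original);      // false: a different object
    }
}
```

The entity text comes out exactly as it went in, because encoding and decoding never look at entities.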
In order to get e with accent acute, you can do one of the following things:
- Type it directly, as in System.out.println("é");. This requires that your text editor and your Java compiler agree on the encoding of the source file. If you're working on a project, everybody has to understand and agree on one encoding. The recommended encoding these days is certainly UTF-8.
- Use the Unicode escape for the character. For e acute that is \u00e9.
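As a rough sketch of the escape approach (the class name is made up for the example): the escape is resolved by the compiler, so the source file stays pure ASCII and the result does not depend on the file's encoding. The example prints the character's numeric value rather than the character itself, since what the console shows depends on the console's own encoding.

```java
public class EAcuteDemo {
    // The compiler turns the escape into the single character U+00E9.
    public static final String ESCAPED = "\u00e9";

    public static void main(String[] args) {
        System.out.println(ESCAPED.length());        // 1
        System.out.println((int) ESCAPED.charAt(0)); // 233
        // Same character built from its numeric value.
        System.out.println(ESCAPED.equals(String.valueOf((char) 0xE9))); // true
    }
}
```

If the source file is saved and compiled with the same encoding, a directly typed "é" literal is equal to ESCAPED.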
P.S.: SGML / XML / HTML5 entities have nothing to do with UTF-8.