When I try to print a UTF-16 string in JSP, specifically Hebrew, it ends up showing up as HTML encoding (&#xxxx).

This problem occurs because I print an array of variables into the web page and then parse them. The variables are all UTF-16 strings, but once the servlet prints them, they are translated to HTML encoding. Is there any way to get rid of the encoding?

Thanks in advance

Edit for a bit more background:

The JSP that I'm printing is not the entirety of the page. It's used in a manner I don't quite understand by a server app, which prints the JSP's output into its built-in page. This isn't a frame or anything like that. It's just redirected output.

  • When you say "showing up as HTML encoding (&#xxxx)", do you mean that on the wire it's really &#xxxx? Commented May 4, 2010 at 10:38
  • No, I mean that it's originally a real string (actual letters), but the letters are translated by the servlet (I assume) to &#xxxx. Commented May 4, 2010 at 10:57
  • That's not default JSP behaviour. You're using some MVC or templating framework you aren't aware of, and you aren't using it the right way. Ask your manager/architect/colleagues what framework it is. I would also just communicate this problem to them; they may know the cause and solution. By the way, this smells a lot like an XML-based MVC framework such as JSF. Commented May 4, 2010 at 11:59
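
If the layer that does the escaping can't be found or reconfigured, one possible workaround is to decode the references back into characters before parsing the values. A minimal sketch in Java, assuming the values arrive as decimal (&#1488;) or hexadecimal (&#x5D0;) numeric character references; `NcrDecoder` is a hypothetical helper, not part of JSP or of any framework mentioned in this thread:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (an assumption for illustration, not an existing API).
class NcrDecoder {
    // Matches decimal (&#1488;) and hexadecimal (&#x5D0;) numeric character references.
    private static final Pattern NCR =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Replaces every valid numeric character reference with the character it names.
    static String decode(String input) {
        Matcher m = NCR.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one unchanged.
            out.append(input, last, m.start());
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                out.append(m.group()); // leave out-of-range references untouched
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Four Hebrew letters written as decimal references.
        System.out.println(decode("&#1513;&#1500;&#1493;&#1501;"));
    }
}
```

This only treats the symptom. If a framework is doing the escaping, turning it off at the source is the cleaner fix.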

1 Answer

See if adding

contentType="text/html; charset=UTF-8"

to the JSP header (<%@ page ... %>) helps.

5 Comments

It didn't work. See my edit for a bit more clarification on the situation.
Also, I tried using UTF-16 instead of UTF-8 for the hell of it, but that completely botched everything up. I assume that wasn't meant to work.
@Ori: I'm not sure if UTF-16 is supported on the web at all. Presumably it is, but I've never seen it used. Without details of how the page is generated, it's hard to say more. Look for what calls setContentType() on the HttpServletResponse object, and where; the value might be "wrong" (not UTF-8) or missing. Also search for a setHeader() call that sets the Content-Type header; I'm not sure which of those takes precedence.
No, character encodings have nothing to do with instantly converting non-ASCII characters to XML entities. A different character encoding would only cause the XML entities to be numbered differently :)
@BalusC: Isn't the entity number the UCS code point? I.e., isn't it independent of the encoding?
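
For what it's worth, a numeric character reference names a Unicode code point, so its number does not change with the byte encoding of the page. Only the bytes on the wire differ. A small Java sketch for the Hebrew letter alef (U+05D0); the class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

// Illustrative class, not taken from the question's code base.
class CodePointDemo {
    // Formats the first code point of s as a decimal numeric character reference.
    static String toReference(String s) {
        return "&#" + s.codePointAt(0) + ";";
    }

    // Renders bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // HEBREW LETTER ALEF
        // The reference depends only on the code point: &#1488;
        System.out.println(toReference(alef));
        // The byte sequences depend on the encoding: D7 90 vs. 05 D0
        System.out.println(hex(alef.getBytes(StandardCharsets.UTF_8)));
        System.out.println(hex(alef.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

So switching the charset between UTF-8 and UTF-16 would not renumber the &#xxxx values. It would only change how any unescaped characters are serialized.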
