
I am using HttpClient (version 3.1) on several different (but apparently identical) computers to read UTF-8 encoded JSON data from a URL.

On all the machines, save one, it works fine. I have some Spanish language words and they come through with accents and tildes intact.

One computer stubbornly refuses to cooperate. It is apparently treating the data as ISO-8859-1, despite a Content-Type: application/json;charset=utf-8 header.

If I use curl to access that URL from that computer, it works correctly. On every other computer, both curl and my HttpClient-based program work correctly.

I ran md5sum on the common-httpclient.jar file on each machine: the checksums are identical.

Is there some setting, deep in Linux, that might be different and be messing with me? Any other theories, or even places to look?

EDIT: some people asked for more details.

Originally I had the problem deep in the bowels of a complex Tomcat app, but I lightly adapted the sample from the HttpClient docs to just retrieve the URL in question, and (fortunately) it showed the same problem.

These are Linux 2.6 machines running jdk1.7.0_45.

An env command yields a bunch of variables. The only one that looks remotely on point is LANG=en_US.UTF-8.

8
  • Could you explain a little more about the machine on which it isn't working? Is it Linux? Which one? Commented May 21, 2014 at 6:08
  • Can you clarify the setup? Is the problem with a command-line client that uses HttpClient to access some URL? Which locale environment variables are set on this computer? Commented May 21, 2014 at 6:30
  • @caramba answer in edit. Commented May 21, 2014 at 10:22
  • How are you then viewing the results? Can you post a short but complete program which demonstrates the problem? (You talk about the sample in the HttpClient docs - are you saying you just need to change the URL in there? Note that the sample uses the platform-default encoding, which is a bad idea.) Can you save the binary data to a file, and compare that with what curl downloads? That would isolate the problem. Commented May 23, 2014 at 5:45
  • Do you use SpringMVC on your serverside Controller? Commented May 26, 2014 at 13:22

3 Answers

5
+100

How do you get the json response data from HttpClient?

If you get it back in binary form (through getResponseBodyAsStream() for example), and then convert it to a String without specifying charset, then the result depends on your JVM's default charset.

You can check the JVM's default charset with:

Charset.defaultCharset().name()

This might give "UTF-8" on all machines except the one failing.
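As a minimal sketch of that check, the small program below prints the default charset next to the settings it is usually derived from. The class name `DefaultCharsetCheck` is made up for illustration. Run it on each machine, and ideally from the same environment that launches the failing program, since a service such as Tomcat may not inherit the LANG value seen in an interactive shell.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    /** Returns the name of the charset the JVM uses when none is given explicitly. */
    public static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The default charset is what new String(bytes) and new InputStreamReader(in) fall back to.
        System.out.println("default charset = " + defaultCharsetName());
        // On this JDK generation the default is normally taken from file.encoding,
        // which in turn is derived from the locale environment on Linux.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("LANG            = " + System.getenv("LANG"));
        System.out.println("LC_ALL          = " + System.getenv("LC_ALL"));
    }
}
```

If the failing machine prints something other than UTF-8, starting the JVM with `-Dfile.encoding=UTF-8` is a commonly used way to force the default, although passing the charset explicitly in the code is the more robust fix.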


2

Without seeing your code, it is difficult to say what's wrong, but here is a "correct" way of doing this (using HttpClient 3.1.0 for request and Jackson 2.1.3 to parse the JSON).

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus; // the 3.x class, not org.apache.http.HttpStatus from 4.x
import org.apache.commons.httpclient.methods.GetMethod;

import java.io.IOException;
import java.io.InputStreamReader;

public class JsonFetcher {
    public static JsonNode fetch(String uri) throws IOException {
        HttpClient hc = new HttpClient();
        GetMethod get = new GetMethod(uri);
        try {
            int status = hc.executeMethod(get);
            if (status != HttpStatus.SC_OK) throw new RuntimeException("http status " + status);
            ObjectMapper jsonParser = new ObjectMapper(new JsonFactory());
            // we use an InputStreamReader with explicit charset to read the response body;
            // getResponseCharSet() returns the charset declared in the Content-Type header
            return jsonParser.readTree(
                new InputStreamReader(get.getResponseBodyAsStream(), get.getResponseCharSet())
            );
        } finally {
            // always hand the connection back to the connection manager
            get.releaseConnection();
        }
    }
}

2

I have run into this issue before, and it was caused by the encoding configured on the client. I had to use a workaround like the one below:

String encmsg = new String(respStr.getBytes("ISO-8859-1"), java.nio.charset.Charset.forName("UTF-8"));

It turns the wrongly decoded String back into its original bytes using ISO-8859-1, then decodes those bytes as UTF-8.
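To see why this round trip works, here is a small self-contained sketch. The class and method names (`MojibakeDemo`, `misdecode`, `repair`) are hypothetical, and the word "español" is only a sample value. It reproduces the garbling by decoding UTF-8 bytes as ISO-8859-1 and then undoes it with the same conversion as above. Non-ASCII characters are written as Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    /** Decodes UTF-8 bytes with the wrong charset, as the failing machine appears to do. */
    public static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** The workaround: recover the original bytes, then decode them as UTF-8. */
    public static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "espa\u00f1ol";      // "español"
        String garbled = misdecode(original);  // n-tilde becomes the two characters U+00C3 U+00B1
        System.out.println(garbled.equals("espa\u00c3\u00b1ol"));
        System.out.println(repair(garbled).equals(original));
    }
}
```

The repair is lossless only because ISO-8859-1 maps every byte value to exactly one character. It also hides the underlying problem rather than fixing it, and it would corrupt text on machines where the response was already decoded correctly, so specifying the charset when reading the response is usually the safer choice.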
