The method below is meant to fetch the HTML source of the page at a given URL, but the result comes back in a different charset on each call, even though the url argument is the same every time. Can someone explain why?

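// Fetches the HTML source of the page at the given URL.
private String getSourceCode(URL url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; U; Intel MacOS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2");

    // No charset is passed here, so the bytes are decoded with the platform default.
    return IOUtils.toString(conn.getInputStream());
}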
  • the remote site has a sense of humor. Commented Sep 7, 2015 at 18:21
  • Maybe :) When I check the response Content-Type, the charset is always UTF-8, but the result is different... Commented Sep 7, 2015 at 18:26
  • Can you check what is the difference in the output? Commented Sep 7, 2015 at 18:27
  • First call: ...y??????Ywmm?Vs??B?0?/M??gJ?l?p.??n.??pBo??N... Second call: normal HTML code. URL: http://habrahabr.ru/post/266163/ Commented Sep 7, 2015 at 18:37
  • The difference is in the encoding of the output string, but I don't know why it differs on each call. Commented Sep 7, 2015 at 18:42

1 Answer

There are several possible reasons. For example, the URL may be backed by several different servers, each with a different default response encoding. One call may be handled by a server that responds in UTF-8, while the next is handled by a server that uses another encoding.
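If the responses really do vary between calls, one way to make the client robust is to stop relying on defaults and decode each response from its own headers. The sketch below is only an illustration under stated assumptions: `ResponseDecoder`, `charsetOf`, and `decode` are hypothetical names, and the thread does not confirm that a compressed body or a charset mismatch is the actual cause here. The garbled sample in the comments would be consistent with either. The code takes the charset from the `Content-Type` value (falling back to UTF-8) and unwraps the body when `Content-Encoding` is `gzip`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the original question or answer.
class ResponseDecoder {

    // Returns the charset named in a Content-Type header value, or UTF-8 if none is usable.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase().startsWith("charset=")) {
                    String name = p.substring("charset=".length()).replace("\"", "").trim();
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: fall back to UTF-8 below.
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Reads the whole body, un-gzipping it if the Content-Encoding value says so,
    // then decodes the bytes with the charset taken from the Content-Type value.
    static String decode(InputStream body, String contentEncoding, String contentType)
            throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(body)
                : body;
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), charsetOf(contentType));
    }

    public static void main(String[] args) throws IOException {
        // Simulate a gzip-compressed UTF-8 body instead of calling a real server.
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(raw)) {
            gz.write("<p>\u041f\u0440\u0438\u0432\u0435\u0442</p>".getBytes(StandardCharsets.UTF_8));
        }
        String html = decode(new ByteArrayInputStream(raw.toByteArray()),
                "gzip", "text/html; charset=UTF-8");
        System.out.println(html);
    }
}
```

With a real connection, the same helper could be called as `decode(conn.getInputStream(), conn.getContentEncoding(), conn.getContentType())`. Logging those two header values on each call would also show whether the server's responses actually differ.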
