I am using Jsoup to parse a page that is encoded in GB2312: http://vars.sinaapp.com/u/t/jsoup_output_encoding_issue.html
Source code:
String testURL = "http://vars.sinaapp.com/u/t/jsoup_output_encoding_issue.html";
Document doc = Jsoup.connect(testURL).get();
System.out.println(doc.select("div").html());
This gives the following output:
1:? 2:� 3:� 4:—
I want the output to match the characters in the page source:
1:· 2:慒 3:啰 4:—
Is there any way to do this?
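One factor that may be involved, and this is an assumption rather than something confirmed for this page, is the encoding used when the text is written out. A character that the output charset cannot represent is replaced with "?". The minimal sketch below uses only the Java standard library (no Jsoup) to show that effect, and to show that writing through a PrintStream with an explicit UTF-8 encoding keeps the characters intact. The class and method names are hypothetical, and US-ASCII stands in for whatever limited encoding a console might use.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ConsoleEncodingSketch {

    // Encodes the text in the given charset and decodes it again, mimicking
    // what is left after writing to an output that uses that charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Writes the text through a PrintStream with an explicit UTF-8 encoding
    // and returns the written bytes decoded as UTF-8.
    static String printAsUtf8(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buffer, true, "UTF-8");
            out.print(text);
            out.flush();
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00B7 is the middle dot and \u2014 is the long dash from the page.
        String sample = "1:\u00B7 4:\u2014";

        // US-ASCII cannot represent either character, so both become '?'.
        System.out.println(roundTrip(sample, StandardCharsets.US_ASCII)); // 1:? 4:?

        // With UTF-8 the text survives unchanged.
        System.out.println(printAsUtf8(sample).equals(sample)); // true
    }
}
```

If this is the cause, wrapping System.out in a PrintStream created with an explicit encoding that the terminal also uses may help. If the characters are already wrong before printing, the decoding step is the more likely place to look.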