$convertedhtml = urlencode(mb_convert_encoding($htmlcode,'UTF-8',"auto"));
$doc = new DOMDocument();
$doc->loadHTML($convertedhtml);

$xpath = new DOMXpath($doc);
$elements = $xpath->query("//*[@id='detail']/div[1]/h3/text()");
$elements->item(0)->nodeValue;

return ($elements->item(0)->nodeValue);

The website uses GBK encoding. If I convert the HTML, nothing is shown at all; if I don't convert it, the characters come out wrong.

Any ideas? As far as I know, mb_* doesn't support GBK?

1 Answer

The DOMDocument::loadHTML() method does not expect a UTF-8 encoded string. In that respect it is an exception among the methods of the DOM extension, most of which do expect UTF-8 encoded strings. The same applies, by the way, to every method of the DOM extension that loads XML/HTML data from a file, a remote location or a string: they follow different and more complex rules for the encoding of their input.

Encoding for DOMDocument::loadHTML():

If the HTML string you pass in does not contain any hint about its encoding (e.g. inside meta tags), then the string must be encoded as Latin-1.

If the string does contain an encoding hint, then it must actually be encoded in the hinted encoding, and that encoding must be one of the supported encodings.
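A minimal sketch of the second rule, under some assumptions: the page bytes are GBK, mbstring is available and knows that code page as 'CP936' (many builds also seem to accept 'GBK' as an alias), and the page carries no conflicting encoding hint of its own. The sample markup and the variable names $utf8Source, $utf8Html and $hinted are made up for illustration, and the source file is assumed to be saved as UTF-8. The idea is to convert the bytes to UTF-8 and prepend a meta tag so that the string and its hint agree:

```php
<?php
// Simulate a GBK-encoded page. The markup is hypothetical; a real page
// would be fetched from the website instead.
$utf8Source = '<html><body><div id="detail"><div><h3>中文标题</h3></div></div></body></html>';
$htmlcode = mb_convert_encoding($utf8Source, 'CP936', 'UTF-8');

// 1. Convert the bytes to UTF-8, naming the source encoding explicitly
//    instead of relying on "auto" detection. No urlencode() here: it would
//    turn the markup into percent-escapes that the parser cannot read.
$utf8Html = mb_convert_encoding($htmlcode, 'UTF-8', 'CP936');

// 2. Prepend an encoding hint so that loadHTML() does not fall back to
//    Latin-1. Assumption: the page has no conflicting hint of its own.
$hinted = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $utf8Html;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($hinted);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$elements = $xpath->query("//*[@id='detail']/div[1]/h3/text()");

// Guard against an empty result before reading the node value.
$title = $elements->length > 0 ? $elements->item(0)->nodeValue : null;
echo $title, PHP_EOL;
```

Values read back from the DOM, such as nodeValue, are UTF-8 strings, so no further conversion should be needed for output on a UTF-8 page. If the real page already declares its own charset, that declaration and the actual bytes have to agree as well.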

Notes:

  • I'm not aware whether a list of supported encodings exists.
  • As you don't show the HTML code you load, I can't say whether it contains a hint about the encoding.
  • I'm not aware whether a list exists of all supported ways to hint the encoding in HTML for DOMDocument::loadHTML().

However: For an example of how to load an HTML document or fragment in a specific encoding, see this related answer of mine:

It most likely will show you how you can load your HTML. It also explains this in more detail. Let me know if it doesn't solve your issue.
