I have an external XML file that I have to pick up. No encoding is declared, but I have discovered that its payload is encoded as ISO-8859-1.

I know this because if I manually edit the file to encoding="ISO-8859-1" then it is processed as expected.

Can I tell SimpleXML which encoding to use when I instantiate the SimpleXML object?

Addendum

Because the XML file was so dirty, I might end up using xmllint. Posting the command here for anyone else interested: it formats the file so it is indented, sets the encoding where none was declared, and cleans up bad entities (& and so on).

xmllint --format --encode iso-8859-1 -o cleansed.xml dirty.xml
  • A commenter on this page: php.net/manual/en/ref.simplexml.php confirms the issue, but doesn't offer a solution. I don't think there is one in SimpleXML. I think your best bet would be @JacobRas's DOMDocument answer. Commented Aug 21, 2011 at 20:55

1 Answer

You can set the encoding for a DomDocument and then convert it to simplexml by using simplexml_import_dom():

$dom = new DomDocument('1.0', 'iso-8859-1');

// load() returns false on failure; the $dom object itself is always truthy
if (!$dom->load('externalfile.xml')) {
    echo 'Parsing error';
    exit;
}

$s = simplexml_import_dom($dom);
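
If the parser still reads the undeclared file as UTF-8, another option is to transcode the raw bytes before parsing. The following is only a minimal sketch: the helper name load_latin1_xml is hypothetical, and it assumes the mbstring extension is available and that the file really is ISO-8859-1 with no conflicting encoding declaration.

```php
<?php
// Hypothetical helper (not part of SimpleXML): read a file whose bytes are
// ISO-8859-1 but which declares no encoding, and return it as SimpleXML.
function load_latin1_xml(string $path): SimpleXMLElement
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Assumption: the payload is ISO-8859-1. Convert it to UTF-8, which is
    // what the parser expects when no encoding is declared.
    $utf8 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

    $xml = simplexml_load_string($utf8);
    if ($xml === false) {
        throw new RuntimeException("Cannot parse $path");
    }
    return $xml;
}
```

Usage would be something like $s = load_latin1_xml('externalfile.xml'); the values read from $s are then UTF-8 strings.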

2 Comments

Niiice. I was gonna suggest editing the XML string through preg_replace but this is much neater.
This only works if the DOM extension is installed. Anyhow, I found so many other XML anomalies that I will likely adopt Robin Winslow's solution.
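
For reference, the string-editing idea from the first comment (the automated version of manually adding encoding="ISO-8859-1") might look roughly like this. It is a sketch only: the function name declare_latin1 is hypothetical, and it assumes the input either has no XML declaration or a well-formed one at the very start of the string.

```php
<?php
// Hypothetical helper: make sure an XML string declares ISO-8859-1.
function declare_latin1(string $raw): string
{
    // An encoding is already declared: leave the string untouched.
    if (preg_match('/^<\?xml[^>]*\bencoding\s*=/i', $raw)) {
        return $raw;
    }

    // A declaration exists without an encoding: add it right after the
    // version attribute, where the XML grammar expects it.
    if (preg_match('/^<\?xml\s+version\s*=\s*(["\'])[^"\']*\1/', $raw, $m)) {
        return substr_replace($raw, $m[0] . ' encoding="ISO-8859-1"', 0, strlen($m[0]));
    }

    // No declaration at all: prepend one.
    return '<?xml version="1.0" encoding="ISO-8859-1"?>' . "\n" . $raw;
}
```

The result can then be passed to simplexml_load_string(), for example simplexml_load_string(declare_latin1(file_get_contents('externalfile.xml'))).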
