When doing something like json_encode($_SERVER) I get an error because the input to be JSON-encoded is not valid UTF-8. In fact, I looked into this error and noticed some user agent strings were encoded in ISO-8859-1. How do I know what encoding was used for the HTTP request, so that I can use utf8_encode() or iconv() as appropriate to be able to JSON-encode the data?
1 Answer
From what I can tell, the standard doesn't say: http://www.w3.org/Protocols/rfc2616/rfc2616.html
A request body should declare its encoding in the Content-Type header, but header values should be plain ASCII. Anything beyond that isn't specified, from what I can tell. I'm not sure that non-ASCII header values are strictly wrong, but there's apparently no standard that tells you which source encoding to pass to iconv.
What I would do is just loop through the string and remove any non-ASCII bytes. Maybe you could hack it out with a simple str_replace or preg_replace call; see the related question "Remove non-ascii characters from string in php".
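A minimal sketch of that approach, assuming you are happy to lose any non-ASCII bytes entirely. The helper name `strip_non_ascii` is made up for illustration; only `preg_replace`, `array_walk_recursive`, and `json_encode` are real PHP functions.

```php
<?php
// Hypothetical helper: drop every byte outside the 7-bit ASCII range.
function strip_non_ascii(string $value): string
{
    // Without the /u modifier PCRE matches byte by byte, so each byte of
    // a multi-byte sequence is removed as well.
    return preg_replace('/[^\x00-\x7F]/', '', $value) ?? '';
}

// Work on a copy so the original superglobal stays untouched.
$server = $_SERVER;
array_walk_recursive($server, function (&$v) {
    if (is_string($v)) {
        $v = strip_non_ascii($v);
    }
});

// Every string is now pure ASCII, which is always valid UTF-8.
$json = json_encode($server);
```

This is lossy by design: a user agent containing an accented character simply loses that character, which is usually acceptable for logging.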
User-Agent should be ASCII. The HTTP request and response body are encoded using the text encoding specified in the charset attribute of the Content-Type header. Header values themselves default to ISO-8859-1. Any other character set is supposed to be escaped using the scheme of RFC2231 (not commonly supported). Some clients will use UTF-8 or %-escaped UTF-8 in headers; the latter is preferable in most cases.
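If you would rather keep the characters than strip them, one possible heuristic follows from the above: treat a header value as UTF-8 when it validates as UTF-8, and otherwise assume ISO-8859-1. This is only a sketch. The function name `to_utf8` is hypothetical, the ISO-8859-1 fallback is a guess rather than something the request tells you, and it assumes the iconv extension is available.

```php
<?php
// Hypothetical helper: return $value as valid UTF-8.
function to_utf8(string $value): string
{
    // With the /u modifier, preg_match() returns false when the subject
    // is not well-formed UTF-8, so a result of 1 means "already UTF-8".
    if (preg_match('//u', $value) === 1) {
        return $value;
    }

    // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
    // Every byte is defined in ISO-8859-1, so the conversion itself
    // should not fail, but the guess may still be wrong.
    $converted = iconv('ISO-8859-1', 'UTF-8', $value);
    return $converted === false ? '' : $converted;
}

// Example: the ISO-8859-1 byte 0xE9 ("é") becomes the UTF-8 pair 0xC3 0xA9.
$userAgent = to_utf8("caf\xE9 browser");
$json = json_encode(['user_agent' => $userAgent]);
```

Pure ASCII passes through unchanged, since it is valid UTF-8. On PHP 7.2 or later, passing the `JSON_INVALID_UTF8_SUBSTITUTE` or `JSON_INVALID_UTF8_IGNORE` flag to `json_encode` is another way to avoid the error without converting anything yourself.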