I am using the awesome Requests module to test an API I've created for one of our internal projects. I believe I have discovered what is either a flaw in the Requests module itself, or a flaw in my usage of it.

Because our data is not super sensitive, our API uses HTTP Basic authentication to control access. When I request the API URL, using JSON as the data format, with either urllib2 and HTTPBasicAuthHandler or PHP and cURL, I get my data back as a properly formatted JSON string, no problem.

However, when I make the same request using the Requests module, I get back an encoded string, and I cannot determine what type of encoding it is. Here is a snippet of the beginning of that string:

\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xadZ\xfb\x8f\xd3H\x12\xfeWzG\xab;\x90

Here are the few lines of code I am using with Requests to reproduce this issue:

import requests
# api_user and api_pw not printed here for security reasons
r = requests.get('http://ourdomain.com/api/featured/school/json', auth=(api_user, api_pw))
status = r.status_code # Produces 200 every time
rawdata = r.read()
print rawdata

And I get that encoded string each time I do that.

Can anyone help me determine (a) what encoding that is (for my own edification), and (b) why Requests is returning data in that encoding, and how to decode and/or "fix" it?

Thanks in advance!

1 Answer

Out of curiosity, what do you get when you print r.content?
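For what it's worth, the bytes quoted in the question start with \x1f\x8b, which is the gzip magic number, so the body is most likely gzip-compressed rather than in some odd character encoding. The Requests documentation describes r.content as decoding gzip and deflate automatically, which would explain why it prints the JSON. If you ever end up holding the raw compressed bytes anyway, here is a rough sketch of decoding them by hand. The decode_body helper and the sample payload are made up for illustration and are not part of Requests.

```python
import json
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(raw):
    """Hypothetical helper: gunzip the body if needed, then parse it as JSON."""
    if raw[:2] == GZIP_MAGIC:
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        raw = zlib.decompress(raw, 16 + zlib.MAX_WBITS)
    return json.loads(raw.decode("utf-8"))


# Stand-in for a compressed response body (assumed data, not the real API output).
payload = json.dumps({"school": "example"}).encode("utf-8")
compressor = zlib.compressobj(9, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
compressed = compressor.compress(payload) + compressor.flush()

print(decode_body(compressed))  # compressed input is gunzipped first
print(decode_body(payload))     # uncompressed input passes straight through
```

In normal use you should not need any of this; r.content (or r.text) is the simpler route.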

Comments

That's interesting! I didn't even see that attribute when I ran dir(r). Printing r.content outputs the JSON string. Is that what should be used instead of read()?
Upon further reflection, I can see how that might be a misuse of the library on my end (i.e., I should have used r.content instead of r.read()), but it doesn't explain why the output differs between my development virtual machine (which, all other factors being the same, outputs the JSON string when calling r.read()) and the production box (which outputs that encoded string). Any ideas why the output is different?
@waveslider I don't know anything about requests other than that it's on my list of things to look into, but at a guess I'd say it has to do with default encodings. Your dev box is probably UTF-8 (which all JSON is supposed to be) and the server is something else. I'm guessing the .content property is looking at all the encoding headers, etc. and applying them, while .read() is just pulling the bytes off the wire, and since it's encoded differently, you get the bytes. Again, all of that is just guessing.
Your OS does have a default encoding, but I don't know exactly how Python interacts with that. I'm almost positive there's a way to override it, but I don't know it off the top of my head. It might help to read the Unicode HOWTO. The best solution is probably to use .content, since that is working and is the way the example code works.
Yes, Python does get the default encoding from the system. It depends on the Python version and the platform and configuration. Here's a great resource for in-depth information: farmdev.com/talks/unicode
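On the "what encoding is it" part: the leading bytes quoted in the question can be checked against the gzip header layout from RFC 1952 (magic number 1f 8b, then compression method 8 for deflate). That suggests HTTP content compression rather than a character-set issue, though I can't confirm what the production server is actually configured to do. A minimal sketch of the check follows. SAMPLE is copied from the question, and describe_gzip_header is a hypothetical helper name, not something Requests provides.

```python
import struct

# First bytes of the response body, as quoted in the question.
SAMPLE = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xadZ\xfb\x8f\xd3H\x12\xfeWzG\xab;\x90"

GZIP_MAGIC = b"\x1f\x8b"


def describe_gzip_header(data):
    """Return the fixed 10-byte gzip header fields, or None if data is not gzip."""
    if len(data) < 10 or data[:2] != GZIP_MAGIC:
        return None
    # Layout after the magic: method, flags, mtime (little-endian), extra flags, OS.
    method, flags, mtime, extra_flags, os_code = struct.unpack("<BBIBB", data[2:10])
    return {
        "method": method,            # 8 means deflate
        "is_deflate": method == 8,
        "flags": flags,
        "mtime": mtime,
        "extra_flags": extra_flags,
        "os": os_code,               # 3 means Unix
    }


print(describe_gzip_header(SAMPLE))
print(describe_gzip_header(b'{"plain": "json"}'))  # not gzip, so None
```

Comparing the response headers from the dev and production servers (Content-Encoding in particular) would be the next thing to look at.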