I want to send a POST request to a REST API with every character Unicode-escaped; for example, the string test should be sent as \u0074\u0065\u0073\u0074. Whatever I try, the string ends up as \\u0074\\u0065\\u0073\\u0074. I can easily modify the request in a proxy such as Burp and remove the double backslashes to make it work.

So the raw bytes sent to the web server are \x5c\x5c\x75\x30\x30\x37\x34\x5c\x5c\x75\x30\x30\x36\x35\x5c\x5c\x75\x30\x30\x37\x33\x5c\x5c\x75\x30\x30\x37\x34

While what I want is: \x5c\x75\x30\x30\x37\x34\x5c\x75\x30\x30\x36\x35\x5c\x75\x30\x30\x37\x33\x5c\x75\x30\x30\x37\x34

One of the things I've tried is this:

import requests

s = 'test'
data = ''
for c in s:
    # Zero-pad to four hex digits so code points below 0x10 or above 0xFF also work
    data += f"\\u{ord(c):04x}"
print(data)
json = {"user": data}
res = requests.post('http://127.0.0.1/api/getusers', json=json)
print(res.text)

Even if I set data = '\x5c\x75\x30\x30\x37\x34\x5c\x75\x30\x30\x36\x35\x5c\x75\x30\x30\x37\x33\x5c\x75\x30\x30\x37\x34', it still sends double backslashes (\x5c\x5c).

1 Answer

It works fine for me. Tested with https://httpbin.davecheney.com/post, Python 3.7 and Requests 2.23.0:

import requests, json

url = r"https://httpbin.davecheney.com/post"

data_raw_str = r"\u0074\u0065\u0073\u0074"

s = 'test'
data = ''
for c in s:
    data += '\\u00' + hex(ord(c))[2:].lower()
    #data += fr"\u{ord(c):04x}" # this works, too
    
json_dict = {'user': data}
r = requests.post(url, json=json_dict)
print(r)

data_returned = json.loads(r.json()['data'])['user']

print(data_raw_str)
print(data)
print(data_returned)
print(data_raw_str == data == data_returned)
print(requests.__version__)

Output:

<Response [200]>
\u0074\u0065\u0073\u0074
\u0074\u0065\u0073\u0074
\u0074\u0065\u0073\u0074
True
2.23.0

Edit:

According to RFC 8259 - The JavaScript Object Notation (JSON) Data Interchange Format - 7. Strings:

All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

So backslashes will always be escaped with another backslash in JSON.

I believe manually removing the extra backslashes will cause the server's JSON decoder to unescape the unicode literals so your string will become plain old test.
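
Both points can be checked with the standard library alone. This is a minimal sketch that assumes the escaping Requests applies for json= matches what json.dumps does by default; the variable names are only for illustration:

```python
import json

# The string "test" written out as literal \uXXXX escape text.
data = ''.join(f'\\u{ord(c):04x}' for c in 'test')

# Serializing to JSON escapes each backslash with another backslash.
body = json.dumps({'user': data})
print(body)  # {"user": "\\u0074\\u0065\\u0073\\u0074"}

# Decoding that body gives back the literal escape text, unchanged.
round_trip = json.loads(body)['user']

# Removing the extra backslashes by hand turns the text into real JSON
# unicode escapes, which a decoder resolves to the plain characters.
edited_body = body.replace('\\\\', '\\')
decoded = json.loads(edited_body)['user']
print(decoded)  # test
```

So the doubled backslashes on the wire are what lets the server recover your literal \u0074... text; with single backslashes it only ever sees test.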

Why does the request have to be JSON?

If you make this request, no additional backslashes are added:

requests.post(url, data=data) # data is a str

And if you make this request, the keys and values are UTF-8 encoded and then URL-encoded (the single backslash is replaced with %5C):

requests.post(url, data=json_dict)
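
To see roughly what that form body looks like without sending anything, here is a small sketch using urllib.parse.urlencode from the standard library. Requests does its own encoding, so treat this as an approximation that I expect to match for this input:

```python
from urllib.parse import urlencode

# Literal escape text with single backslashes.
data = r'\u0074\u0065\u0073\u0074'

# Form-encode the dict the same way an HTML form submission would.
form_body = urlencode({'user': data})
print(form_body)  # user=%5Cu0074%5Cu0065%5Cu0073%5Cu0074
```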

3 Comments

It seems to be working from the client side. But if you either intercept the request using a proxy or look at the data in Wireshark, you can see that the data is sent with \x5c\x5c, i.e. `\\` in ASCII. I tested your example, and it yields the same result.
Yes, you're right. I've updated my answer. Long story short, JSON strings cannot include an unescaped (single) backslash. Make a POST request that doesn't send JSON if you must have unescaped backslashes.
The server only accepts JSON, but I guess I can create the JSON as a string and send it. Thanks for the in-depth answer :)
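
A minimal sketch of that last idea, building the JSON text by hand and passing it as a raw body. The header dict and the commented-out call are assumptions about how you would send it; url is a placeholder, and nothing is sent here:

```python
import json

# Hand-written JSON text: the raw string keeps each \uXXXX as a
# single-backslash escape inside the JSON document.
body = r'{"user": "\u0074\u0065\u0073\u0074"}'
payload = body.encode('ascii')  # bytes to use as the request body
headers = {'Content-Type': 'application/json'}

# With Requests, sending would look like this (url is a placeholder):
# requests.post(url, data=payload, headers=headers)

# What a standards-compliant JSON decoder would make of that body:
print(json.loads(body))  # {'user': 'test'}
```

Note that a compliant decoder on the server will still turn these escapes into plain test, as the answer says.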
