I'm trying to parse a JSON string with

json.loads(json_string)

but it returns a string instead of a dict. I can get the expected result by parsing it again

json.loads(json.loads(json_string))

but I don't understand why.

I receive a bytes object from a webhook:

bytes_object = b'"{\\"action\\":\\"connection_test\\",\\"data\\":{}}"'

The bytes object is then utf-8 decoded:

decoded_bytes = bytes_object.decode('utf-8')
"{\"action\":\"connection_test\",\"data\":{}}"

Then, the utf-8 decoded object is parsed using json.loads:

parsed_once = json.loads(decoded_bytes)

This returns a string rather than a dict. The string looks like this:

{"action":"connection_test","data":{}}

of type <class 'str'>.

But if I parse it again I get the dict expected from the first try:

parsed_twice = json.loads(parsed_once)
{'action': 'connection_test', 'data': {}}

of type <class 'dict'>.

I suspect it's something about how Python 3.9 handles JSON escaping, but I'm not sure. Any help?

  • 1
    Do you control the webhook's output? Because it seems to be double-encoding the original dict when converting to JSON (similar to the reverse of calling json.dumps twice). Commented May 19, 2021 at 12:46
  • 3
    There's something wonky about the bytes object you receive there. You can see the bytestring's first and last characters (within the quotes) are ", so the bytestring indeed is a JSON string that contains a JSON representation of an object. Commented May 19, 2021 at 13:17
  • 3
    The JSON is double encoded, so you need to double-decode it. Commented May 19, 2021 at 13:21
  • 2
    There is nothing wrong with what you are doing by calling json.loads twice, and it isn't a Python 3.9 problem. The more pressing question is why the webhook double-encoded the data, which can't be answered from the information given. Commented May 19, 2021 at 13:29
  • 2
    A "regular" JSON object would look like b'{"action": "connection_test", "data": {}}'. What you are getting is a JSON object that was encoded again to produce a JSON string. The first json.loads decodes that JSON string into a Python str holding the JSON object's text; the second decodes that text into a Python dict. Commented May 19, 2021 at 13:34

1 Answer

The JSON is double encoded, so it needs to be double-decoded. It went something like this:

>>> import json
>>> data = {'action': 'connection_test', 'data': {}}
>>> a = json.dumps(data)
>>> print(a)
{"action": "connection_test", "data": {}}
>>> b = json.dumps(a)
>>> print(b)
"{\"action\": \"connection_test\", \"data\": {}}"

That's a mistake that needs to be rectified on the producer side. As long as the producer gives you this double encoded JSON, you need to double decode it.
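
Until the producer is fixed, a consumer may want to accept both single- and double-encoded payloads. Here is a minimal sketch of one way to do that. The helper name `loads_nested` and its `max_depth` parameter are made up for illustration and are not part of the `json` module.

```python
import json


def loads_nested(payload, max_depth=2):
    """Decode JSON, decoding again while the result is still a str.

    Hypothetical helper: performs at most max_depth decoding passes.
    """
    result = json.loads(payload)  # json.loads accepts str or bytes
    for _ in range(max_depth - 1):
        if not isinstance(result, str):
            break  # already a dict, list, number, etc.
        try:
            result = json.loads(result)
        except json.JSONDecodeError:
            break  # an ordinary string value, not nested JSON
    return result


bytes_object = b'"{\\"action\\":\\"connection_test\\",\\"data\\":{}}"'
print(loads_nested(bytes_object))
```

One caveat with this approach: a payload that is legitimately a JSON string whose text happens to be valid JSON (for example `"123"`) would also be decoded a second time, so it is only suitable when the top-level value is expected to be an object or array.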


1 Comment

Possibly some JavaScript is running JSON.stringify() on something that is already valid JSON.
