Why do I need to do `json.loads` twice to parse a JSON string?

Question

I'm trying to parse a JSON string with

json.loads(json_string)

but it returns a string instead of a dict. I can get the expected result by parsing it again

json.loads(json.loads(json_string))

but I don't understand why.

I receive a bytes object from a webhook:

bytes_object = b'"{\\"action\\":\\"connection_test\\",\\"data\\":{}}"'

The bytes object is then utf-8 decoded:

decoded_bytes = bytes_object.decode('utf-8')
"{\"action\":\"connection_test\",\"data\":{}}"

Then, the utf-8 decoded object is parsed using json.loads:

parsed_once = json.loads(decoded_bytes)

However, this returns a string rather than a dict:

{"action":"connection_test","data":{}}

of type <class 'str'>.

But if I parse it again I get the dict expected from the first try:

parsed_twice = json.loads(parsed_once)
{'action': 'connection_test', 'data': {}}

of type <class 'dict'>.

I suspect it's something about how Python 3.9 handles JSON escaping, but I'm not sure. Any help?

Do you control the webhook's output? It seems to be double-encoding the original dict when converting to JSON, as if json.dumps had been called twice.
– Gino Mempin, Commented May 19, 2021 at 12:46
There's something wonky about the bytes object you receive there. The bytestring's first and last characters (within the quotes) are ", so the bytestring is a JSON string that contains the JSON representation of an object.
– AKX, Commented May 19, 2021 at 13:17
The JSON is double encoded, so you need to double-decode it.
– deceze ♦, Commented May 19, 2021 at 13:21
There is nothing wrong with calling json.loads twice, and it isn't a Python 3.9 problem. The more pressing question is why the webhook needed to double-encode the data, which isn't answerable in the current state.
– Gino Mempin, Commented May 19, 2021 at 13:29
A "regular" JSON object would look like b'{"action": "connection_test", "data": {}}'. What you are getting is a JSON object that has been encoded again to produce a JSON string. The first json.loads decodes the JSON string into a Python str containing the JSON object's text; the second decodes that text into a Python dict.
– chepner, Commented May 19, 2021 at 13:34
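
The difference described in the comments above can be checked directly. This is a small sketch that uses the payload from the question and a singly encoded variant for comparison; the variable names are illustrative only.

```python
import json

# Singly encoded payload (what a well-behaved producer would send).
regular = b'{"action": "connection_test", "data": {}}'
# Double-encoded payload, copied from the question.
double_encoded = b'"{\\"action\\":\\"connection_test\\",\\"data\\":{}}"'

# A singly encoded object starts with '{' and decodes straight to a dict.
print(regular[:1], type(json.loads(regular)).__name__)

# A double-encoded object starts with '"' and decodes to a str
# that itself contains JSON text.
first_pass = json.loads(double_encoded)
print(double_encoded[:1], type(first_pass).__name__)

# Only the second pass yields the dict.
print(type(json.loads(first_pass)).__name__)
```

Checking the first byte, or the type returned by the first json.loads call, is usually enough to tell which kind of payload arrived.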

deceze · Accepted Answer · 2021-05-20 07:11:53Z

The JSON is double encoded, so it needs to be double-decoded. The producer most likely did something like this:

>>> import json
>>> data = {'action': 'connection_test', 'data': {}}
>>> a = json.dumps(data)
>>> print(a)
{"action": "connection_test", "data": {}}
>>> b = json.dumps(a)
>>> print(b)
"{\"action\": \"connection_test\", \"data\": {}}"

That's a mistake that needs to be rectified on the producer side. As long as the producer gives you this double encoded JSON, you need to double decode it.
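
Until the producer is fixed, the consumer can decode defensively. The following is a minimal sketch, not a library function: `loads_nested` and its `max_depth` parameter are hypothetical names, and it assumes the payload is never meant to end up as a plain string.

```python
import json


def loads_nested(payload, max_depth=2):
    """Decode JSON that may have been encoded more than once.

    Hypothetical helper: keeps calling json.loads while the result is
    still a str, for at most max_depth passes in total.
    """
    result = json.loads(payload)  # accepts str, bytes, or bytearray
    for _ in range(max_depth - 1):
        if not isinstance(result, str):
            break  # already a dict, list, number, etc.
        try:
            result = json.loads(result)
        except json.JSONDecodeError:
            # The string was ordinary text, not another layer of JSON.
            break
    return result


# Payload copied from the question.
bytes_object = b'"{\\"action\\":\\"connection_test\\",\\"data\\":{}}"'
parsed = loads_nested(bytes_object)
print(type(parsed).__name__, parsed)
```

One caveat with this approach: a payload that is legitimately a JSON string whose content happens to parse as JSON (for example "123") would be decoded one step too far, so it only suits endpoints where an object or array is expected.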

1 Comment

Glen Davies Over a year ago

Possibly some JavaScript is running JSON.stringify() on something that is already valid JSON.
