I'm trying to create a program that can identify which delimiter characters are missing and insert them in their respective position within a JSON file. So, for instance, let's suppose my JSON looks like this:

{
    "array": [{
        "id": "123",
        "info": {
            "name": "something"
        }
        "address": {
            "street": "Dreamland"
        }
    }]
}

This JSON is invalid since there's no , between } and "address".

Here's the thing: when I try to use json.loads(my_json.read()), it'll throw a JSONDecodeError exception, which will tell me what character is missing and at what character position it should be, like so:

json.decoder.JSONDecodeError: Expecting ',' delimiter: line 7 column 9 (char 116)

I've already tried printing my_json.seek(116), but that didn't help: it just printed the number 116.
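A note on the seek call, as a minimal sketch: seek() moves the file position and returns the new position, not the text found there, which is why a number gets printed. To see what sits at the reported offset, it is usually easier to index the string that was handed to json.loads. The io.StringIO object and the short sample text below are stand-ins for the real file, not the original data.

```python
import io
import json

# Stand-in for the opened file (assumption: the real code uses open(...)).
my_json = io.StringIO('{"a": {"x": 1} "b": 2}')

# seek() returns the new position, not the character at that position.
returned = my_json.seek(3)
my_json.seek(0)  # rewind before reading the whole document

text = my_json.read()
try:
    json.loads(text)
except json.JSONDecodeError as e:
    # e.pos is an index into the string that was parsed.
    offending = text[e.pos]
    context = text[max(0, e.pos - 10):e.pos + 10]
    print(repr(offending), repr(context))
```

Here offending is the first character the parser could not accept, so the missing comma belongs just before it.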

So my question is: is it possible to actually use the exception error message to fix the JSON since it shows what delimiter character is missing and at what position it should be, or should I just use a stack data structure to keep track of all possible delimiters and insert the missing ones? Or is there a better way to do this?

I've seen other questions in here like this one and this one but the first one is an encoding issue, which is not my case, and the second one is for a very specific case.

Sorry if some of this seems dumb, I'm still learning how to code in Python.

Thanks for your time :)

1 Answer

You can slice the string at the position where the ',' delimiter is missing (available as the pos attribute of the exception object) and join the two halves with ',':

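import json

s = '[{"a":1}{"b":2}{"c":3}]'
while True:
    try:
        # Succeeds once every missing comma has been inserted.
        data = json.loads(s)
        break
    except json.decoder.JSONDecodeError as e:
        # Only a missing ',' is repaired; any other error is re-raised.
        if not e.args[0].startswith("Expecting ',' delimiter:"):
            raise
        # e.pos is the index of the character found where ',' was expected.
        s = ','.join((s[:e.pos], s[e.pos:]))
print(s)
print(data)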

This outputs:

[{"a":1},{"b":2},{"c":3}]
[{'a': 1}, {'b': 2}, {'c': 3}]
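
The same loop can be wrapped in a reusable function. This is only a sketch: the names insert_missing_commas and max_fixes are made up for illustration, and the cap on the number of insertions is a defensive assumption rather than something the json module requires. It compares the exception's msg attribute, which holds the message without the "line ... column ..." suffix.

```python
import json


def insert_missing_commas(text, max_fixes=100):
    """Return (fixed_text, parsed_data), inserting ',' wherever the
    parser reports a missing comma delimiter.

    Hypothetical helper; max_fixes is an arbitrary safety cap.
    """
    for _ in range(max_fixes + 1):
        try:
            return text, json.loads(text)
        except json.JSONDecodeError as e:
            # Anything other than a missing comma is not handled here.
            if e.msg != "Expecting ',' delimiter":
                raise
            # Insert the comma just before the unexpected character.
            text = text[:e.pos] + "," + text[e.pos:]
    raise ValueError(f"gave up after {max_fixes} inserted commas")


fixed, data = insert_missing_commas('[{"a":1}{"b":2}{"c":3}]')
print(fixed)
print(data)
```

Note that the comma is inserted right before the offending character, so in pretty-printed input it lands at the start of the next line rather than after the closing brace. The result is still valid JSON.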

3 Comments

You can also use the colno, lineno, and/or pos attributes of the JSONDecodeError to save on some regex-ing. To make it even simpler, you can pre-minify the text to get rid of the whitespace, then use pos alone to find the insertion point: minified = ''.join(x.strip() for x in s.splitlines())
Updated to use the pos attribute as suggested. Thanks! I don't think minification makes any difference, though: the pos attribute is just as accurate even with redundant whitespace.
Good point! For some reason I thought pos and colno were the same.
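
To illustrate the difference discussed in these comments, here is a small sketch: pos is an absolute index into the whole string, while lineno and colno are 1-based and relative to the line. The sample document is made up, and the conversion assumes lines are separated by a single '\n'.

```python
import json

# Made-up multi-line document with a missing comma before "b".
doc = '{\n  "a": {"x": 1}\n  "b": 2\n}'

try:
    json.loads(doc)
except json.JSONDecodeError as e:
    err = e

# Rebuild the absolute offset from the 1-based line and column numbers.
lines = doc.split("\n")
line_start = sum(len(line) + 1 for line in lines[:err.lineno - 1])
offset = line_start + (err.colno - 1)

print(err.lineno, err.colno, err.pos, offset)
```

Since offset and err.pos agree, pos can be used directly for slicing, with no minification or line arithmetic needed.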
