
I am trying to load and read a JSON file with this code:

from json import load

try:
    json_data = open('sample3.json')
    data = load(json_data)
    json_data.close()
    insert_data(data)
except Exception as e:
    print "Finished with error %s" % (repr(e))

This is the JSON file:

{"competitions":
    [
    {"name":"Premiership","nation":"ENG","id":32711,"matches": 
        [
        {"id":7245940,"when":"28.02.2015 12:45",
            "home_team": {"id":430934, "name":"West Ham"},
            "away_team": {"id":430936, "name":"Crystal Palace"},
            "played":1,
            "play_off":0,
            "round":27
                ,"score":{"t1_score":1,"t2_score":3 },
            "score_ht":{"t1_score":0,"t2_score":1}
        }
        ]
    }
    ]
}

and this is the error I am getting: Finished with error ValueError('No JSON object could be decoded',)

I tried the file in JSONLint and it says it is valid.

What am I doing wrong?

UPDATE: this is the output of print repr(json_data.read())

'\xef\xbb\xbf{"competitions":\n    [\n    {"name":"Premiership","nation":"ENG","id":32711,"matches": \n        [\n        {"id":7245940,"when":"28.02.2015 12:45",\n            "home_team": {"id":430934, "name":"West Ham"},\n            "away_team": {"id":430936, "name":"Crystal Palace"},\n            "played":1,\n            "play_off":0,\n            "round":27\n                ,"score":{"t1_score":1,"t2_score":3 },\n            "score_ht":{"t1_score":0,"t2_score":1}\n        }\n        ]\n    }\n    ]\n}\n'
Finished with error ValueError('No JSON object could be decoded',)
  • Are you 100% certain you are opening the correct file? What does print repr(json_data.read()) produce? Commented Mar 12, 2015 at 12:00
  • try removing last newline character from json string data Commented Mar 12, 2015 at 12:07
  • I can't find a way. I opened the file in vim and it doesn't show anything. set list command just shows a $ at the end of the last line. Commented Mar 12, 2015 at 12:17

1 Answer


Your JSON file starts with a UTF-8 BOM (Byte Order Mark): the three bytes '\xef\xbb\xbf' at the start of your repr() output. JSON doesn't support such a character. It is usually added by Microsoft tools (such as Notepad) to help detect the encoding, but it carries no meaning in UTF-8 since there is no byte order variation.
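To check whether some data begins with the BOM, compare its first bytes against codecs.BOM_UTF8. The following is only a minimal sketch: the helper name starts_with_utf8_bom and the sample byte strings are made up for illustration, and it is written so that it should run on both Python 2 and Python 3.

```python
import codecs


def starts_with_utf8_bom(raw):
    """Return True if the byte string `raw` begins with the UTF-8 BOM.

    Hypothetical helper, not part of any library.
    """
    return raw.startswith(codecs.BOM_UTF8)


# The BOM is the three bytes EF BB BF, which is U+FEFF encoded as UTF-8.
bom_bytes = u'\ufeff'.encode('utf-8')

# Made-up sample data: the same document with and without a leading BOM.
has_bom = starts_with_utf8_bom(b'\xef\xbb\xbf{"competitions": []}')
no_bom = starts_with_utf8_bom(b'{"competitions": []}')
```

Checking the raw bytes like this is more reliable than looking at the file in an editor, since many editors hide the BOM.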

You'll have to skip these bytes yourself, as even telling load() to use the utf-8-sig encoding doesn't help here.

You can use codecs.BOM_UTF8 to detect it:

import codecs
import json

# open in binary mode so the first three bytes can be compared to the BOM
with open('sample3.json', 'rb') as json_data:
    bom_maybe = json_data.read(3)
    if bom_maybe != codecs.BOM_UTF8:
        # no BOM at the start, rewind
        json_data.seek(0)
    data = json.load(json_data)
insert_data(data)

Alternatively, use io.open() with the utf-8-sig codec to read and decode the data, then pass the decoded text to json.loads():

import io
import json

with io.open('sample3.json', encoding='utf-8-sig') as json_data:
    data = json.loads(json_data.read())
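The utf-8-sig codec drops a leading BOM when one is present and otherwise behaves like plain utf-8, so the same code handles files saved with or without the BOM. Here is a small sketch of that behaviour on in-memory bytes. The helper name parse_json_bytes and the sample data are assumptions for illustration only, and the code should run on both Python 2 and Python 3.

```python
import json


def parse_json_bytes(raw):
    """Decode UTF-8 bytes, dropping a leading BOM if any, then parse as JSON.

    Hypothetical helper for illustration.
    """
    # 'utf-8-sig' strips the BOM when present; without one it decodes
    # exactly like 'utf-8'.
    text = raw.decode('utf-8-sig')
    return json.loads(text)


# Made-up sample data: identical JSON with and without the BOM prefix.
with_bom = parse_json_bytes(b'\xef\xbb\xbf{"round": 27}')
without_bom = parse_json_bytes(b'{"round": 27}')
```

Both calls produce the same dictionary, so there is no need to know in advance whether the file was saved with a BOM.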

Demo:

>>> import codecs
>>> import io
>>> import json
>>> open('/tmp/test.json', 'wb').write('\xef\xbb\xbf{"competitions":\n    [\n    {"name":"Premiership","nation":"ENG","id":32711,"matches": \n        [\n        {"id":7245940,"when":"28.02.2015 12:45",\n            "home_team": {"id":430934, "name":"West Ham"},\n            "away_team": {"id":430936, "name":"Crystal Palace"},\n            "played":1,\n            "play_off":0,\n            "round":27\n                ,"score":{"t1_score":1,"t2_score":3 },\n            "score_ht":{"t1_score":0,"t2_score":1}\n        }\n        ]\n    }\n    ]\n}\n')
>>> with open('/tmp/test.json') as json_data:
...     bom_maybe = json_data.read(3)
...     if bom_maybe != codecs.BOM_UTF8:
...         json_data.seek(0)
...     data = json.load(json_data)
... 
>>> data
{u'competitions': [{u'id': 32711, u'matches': [{u'score_ht': {u't2_score': 1, u't1_score': 0}, u'home_team': {u'id': 430934, u'name': u'West Ham'}, u'away_team': {u'id': 430936, u'name': u'Crystal Palace'}, u'played': 1, u'when': u'28.02.2015 12:45', u'round': 27, u'score': {u't2_score': 3, u't1_score': 1}, u'play_off': 0, u'id': 7245940}], u'name': u'Premiership', u'nation': u'ENG'}]}
>>> with io.open('/tmp/test.json', encoding='utf-8-sig') as json_data:
...     data = json.loads(json_data.read())
... 
>>> data
{u'competitions': [{u'id': 32711, u'matches': [{u'score_ht': {u't2_score': 1, u't1_score': 0}, u'home_team': {u'id': 430934, u'name': u'West Ham'}, u'away_team': {u'id': 430936, u'name': u'Crystal Palace'}, u'played': 1, u'when': u'28.02.2015 12:45', u'round': 27, u'score': {u't2_score': 3, u't1_score': 1}, u'play_off': 0, u'id': 7245940}], u'name': u'Premiership', u'nation': u'ENG'}]}

4 Comments

Thanks. Now the junk is removed. I still get the same error message.
@xpanta: how did you remove the bytes? The editor that you are using is probably still including them when saving the file! Instead, detect if the first 3 bytes form the BOM so you can skip them.
@xpanta, the BOM characters "\xef\xbb\xbf" are there in your message as well; I did not notice them myself.
@NikosM.: I asked the OP to add the output of repr(json_data.read()) to their question, because I suspected a BOM might be involved.
