I have a JSON file that contains about 100,000 lines in the following format:

{
"00-0000045": {
    "birthdate": "5/18/1975",
    "college": "Michigan State",
    "first_name": "Flozell",
    "full_name": "Flozell Adams",
    "gsis_id": "00-0000045",
    "gsis_name": "F.Adams",
    "height": 79,
    "last_name": "Adams",
    "profile_id": 2499355,
    "profile_url": "http://www.nfl.com/player/flozelladams/2499355/profile",
    "weight": 338,
    "years_pro": 13
},
"00-0000108": {
    "birthdate": "12/9/1974",
    "college": "Louisville",
    "first_name": "David",
    "full_name": "David Akers",
    "gsis_id": "00-0000108",
    "gsis_name": "D.Akers",
    "height": 70,
    "last_name": "Akers",
    "number": 2,
    "profile_id": 2499370,
    "profile_url": "http://www.nfl.com/player/davidakers/2499370/profile",
    "weight": 200,
    "years_pro": 16
}
}

I am trying to delete all the items that do not have a gsis_name property. So far I have this Python code, but it does not delete any values. (Note: I do not want to overwrite the original file.)

import json

with open("players.json") as json_file:
    json_data = json.load(json_file)
    for x in json_data:
        if 'gsis_name' not in x:
            del x
print json_data

3 Answers

2

You're deleting x, but x is only the loop variable: a name bound to one of the keys of json_data. del x unbinds that name; it doesn't remove anything from the dict the key was drawn from.

In Python, if you want to filter some items out of a collection your best bet is to copy the items you do want into a new collection.

clean_data = {k: v for k, v in json_data.items() if 'gsis_name' in v}

and then write clean_data to a file with json.dump.
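
As a rough end-to-end sketch of that approach (written for Python 3; the function names and the output file name mentioned afterwards are made up for illustration, not taken from the question), you could load the file, filter into a new dict, and dump the result to a different path so the original stays untouched:

```python
import json


def drop_players_without(players, required_key="gsis_name"):
    """Return a new dict with only the records that contain required_key."""
    return {pid: rec for pid, rec in players.items() if required_key in rec}


def filter_file(src_path, dst_path, required_key="gsis_name"):
    """Read src_path, filter it, and write the result to dst_path."""
    with open(src_path) as src:
        players = json.load(src)

    cleaned = drop_players_without(players, required_key)

    # Writing to a separate path leaves the source file as it was.
    with open(dst_path, "w") as dst:
        json.dump(cleaned, dst, indent=4, sort_keys=True)
    return cleaned
```

Something like filter_file("players.json", "players_clean.json") would then produce the filtered copy; the second file name is only a placeholder.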

0

When you say del x, you are unassigning the name x from your current scope (in this case, global scope, since the delete is not in a class or function).

You need to delete it from the object json_data. json.load returns a dict because your main object is an associative array / map / JavaScript object. When you iterate a dict, you are iterating over the keys, so x is a key (e.g. "00-0000108"). This is a bug: you want to check whether the value has the key gsis_name, not whether the key string contains it.
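
To make that concrete, here is a small Python 3 sketch using a two-record dict shaped like the question's data (the second record is invented for the demonstration). It shows that the loop variable is a key string, that the membership test on it is a substring check, and that del x leaves the dict alone:

```python
players = {
    "00-0000045": {"gsis_name": "F.Adams"},
    "00-0000999": {"full_name": "Example Player"},  # hypothetical record
}

seen = []
for x in players:
    seen.append(x)  # x is a key such as "00-0000045", not the record
    del x           # unbinds the name x only; players is unchanged

# 'gsis_name' in <key string> is a substring test, so it is False for every key.
key_checks = ["gsis_name" in key for key in players]

# The test that was intended looks inside each value instead.
value_checks = ["gsis_name" in record for record in players.values()]

print(len(players))   # still 2: nothing was removed
print(key_checks)     # [False, False]
print(value_checks)   # [True, False]
```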

The documentation for dict shows you how to delete from a dict using the key: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict

del d[key]

Remove d[key] from d. Raises a KeyError if key is not in the map.

But as the other answers say, it's better to create a new dict with the objects you want, rather than removing the objects you don't want.
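
If you do want to delete in place with del d[key], a minimal Python 3 sketch could look like the following (remove_in_place is a made-up helper name). The one thing to watch is that the loop runs over a snapshot of the keys, because Python 3 raises a RuntimeError if a dict changes size while it is being iterated directly:

```python
def remove_in_place(players, required_key="gsis_name"):
    """Delete every record that lacks required_key from players itself."""
    # list(players) copies the keys first, so deleting inside the loop is safe.
    for pid in list(players):
        if required_key not in players[pid]:
            del players[pid]
    return players
```

This mutates the dict that json.load returned, so it still does not touch the file on disk unless you write the result back out.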

0

Just create a new dict without the unwanted elements:

res = dict((k, v) for k, v in json_data.iteritems() if 'gsis_name' in v)

In Python 2.7 and later, you can use a dict comprehension instead.

1 Comment

Note that in Python 3, you would use json_data.items() instead of json_data.iteritems().
