I have a large JSON file that contains image annotation data. I am iterating through one of its keys, as shown below:

import json
  
# Opening JSON file
f = open('annotations.json')
  
# returns JSON object as 
# a dictionary
data = json.load(f)
  
# Iterating through the json
# list
for i in data['annotations']:
    if i['segmentation'] == [[]]:
        print(i['segmentation'])
        del i
    #print(i['segmentation'])
  
# Closing file
f.close()

When printed, the returned dictionaries look like this:

{"iscrowd":0,"image_id":32,"bbox":[],"segmentation":[[]],"category_id":2,"id":339,"area":0}

I am trying to remove entries like the ones below from the annotations key, that is, the entries that contain no segmentation data. I am able to extract these entries; I am just not sure how to remove them without breaking the format of the file.

{"iscrowd":0,"image_id":32,"bbox":[],"segmentation":[[]],"category_id":2,"id":339,"area":0}
,{"iscrowd":0,"image_id":32,"bbox":[],"segmentation":[[]],"category_id":2,"id":340,"area":0}
,{"iscrowd":0,"image_id":32,"bbox":[],"segmentation":[[]],"category_id":2,"id":341,"area":0}
,{"iscrowd":0,"image_id":32,"bbox":[],"segmentation":[[]],"category_id":2,"id":342,"area":0},
...

Here is what finally got it working for me:

import json
  
# Opening JSON file
f = open('annotations.json')
  
# returns JSON object as 
# a dictionary
data = json.load(f)

# Closing file
f.close()

# Iterating through the json
# list
count = 0
for key in data['annotations']:
    count +=1
    if key['segmentation'] == [[]]:
        print(key['segmentation'])
        data["annotations"].pop(count)
    if key['bbox'] == []:
        data["annotations"].pop(count)
    #print(i['segmentation'])

with open("newannotations.json", "w") as json_file:  
        json.dump(data, json_file)
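
A note on the loop above: it pops from data["annotations"] while iterating over that same list, and count is incremented before the pop, so the index may not point at the entry that was just tested. A minimal sketch that sidesteps this, assuming the file layout shown in the question, builds a new filtered list instead of removing items in place. The names is_empty, drop_empty_annotations, and clean_file are illustrative only and not part of any library.

```python
import json


def is_empty(annotation):
    """Return True if the annotation has no segmentation or no bbox data."""
    return annotation.get("segmentation") == [[]] or annotation.get("bbox") == []


def drop_empty_annotations(data):
    """Return a copy of data whose 'annotations' list omits empty entries."""
    cleaned = dict(data)  # shallow copy; other top-level keys are kept as-is
    cleaned["annotations"] = [a for a in data["annotations"] if not is_empty(a)]
    return cleaned


def clean_file(src_path, dst_path):
    """Read src_path, drop empty annotations, and write the result to dst_path."""
    with open(src_path) as src:
        data = json.load(src)
    with open(dst_path, "w") as dst:
        json.dump(drop_empty_annotations(data), dst)
```

Calling clean_file("annotations.json", "newannotations.json") would then write the filtered data in a single pass, since the original list is never modified while it is being read.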

1 Answer

The function json.load() reads a JSON file and returns the corresponding Python object (here, a dictionary), which you can then modify as you like. Similarly, json.dump() can be used to write a Python dictionary back out to a JSON file.

To remove an entry from a dictionary, you can use the dictionary's pop() method. Assuming you want to delete each entry referred to by the key i (as the del i suggests) whenever data["annotations"][i]["segmentation"] == [[]], you could do it approximately as follows:

import json
  
# Opening JSON file
f = open('annotations.json')
  
# returns JSON object as 
# a dictionary
data = json.load(f)

# Closing file
f.close()

# Iterating through the json
# list
for key in data['annotations']:
    if data["annotations"][key]['segmentation'] == [[]]:
        print(data["annotations"][key]['segmentation'])
        data["annotations"].pop(key)
    #print(i['segmentation'])

with open("newannotations.json", "w") as json_file:  
        json.dump(data, json_file)

Is this what you wanted to do?
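
One caveat about the loop in this answer: judging by the sample in the question, data["annotations"] looks like a list of dictionaries rather than a dictionary, so indexing it with each item would raise a TypeError. If in-place removal is preferred, a rough sketch (the function name remove_empty_segmentations is hypothetical) is to delete by index while walking the list backwards:

```python
def remove_empty_segmentations(annotations):
    """Delete, in place, every entry whose 'segmentation' is [[]]."""
    # Walk the indices from last to first: deleting item i only shifts
    # the items after i, and those have already been visited.
    for index in range(len(annotations) - 1, -1, -1):
        if annotations[index]["segmentation"] == [[]]:
            del annotations[index]
    return annotations
```

It would be called as remove_empty_segmentations(data["annotations"]) after json.load() and before json.dump().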


2 Comments

Thanks, this put me on the right track. I was getting a TypeError, so I slightly modified your answer and it worked. I will post my working solution, but I will accept your answer. One issue I am still having is that it does not delete every occurrence. When I re-run the program it deletes more, but still not all of them. I have to keep re-running the program until it clears all of them. Perhaps this is because of the size of the file?
Yeah, since I didn't have access to the source file you were processing, I couldn't really debug it efficiently; it was meant to be more of a template to work from. I'm also not sure why you would only be hitting some of the targets, but I hope you can figure it out with some debugging x)
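
For what it's worth, file size is probably not the cause. Removing items from a list while looping over that same list shifts the remaining items left, so the entry that slides into the freed slot is never examined. The toy sketch below (naive_remove is a made-up name, and the short list stands in for data["annotations"]) reproduces the "have to re-run it" behaviour on five items:

```python
def naive_remove(items):
    """Pop empty lists while iterating over the same list (buggy on purpose)."""
    for index, item in enumerate(items):
        if item == []:
            # After this pop, the next item moves into position `index`,
            # but the loop continues at index + 1 and skips it.
            items.pop(index)
    return items


def passes_needed(items):
    """Count how many naive passes it takes to clear every empty list."""
    passes = 0
    while any(item == [] for item in items):
        naive_remove(items)
        passes += 1
    return passes
```

With consecutive empty entries, as in the question's sample, each pass leaves some of them behind, which would match what you are seeing.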
