I have a JSON file with multiple duplicate keys in the following format:

"data": {
    "nameA": {
        "result": someInt,
        "timestamp": "someTime"
    },
    "nameB": {
        "result": someInt,
        "timestamp": "someTime"
    },
    "nameA": {
        "result": someInt,
        "timestamp": "someTime"
    },
    "nameC": {
        "result": someInt,
        "timestamp": "someTime"
    }
}

I need to count how many times each key occurs under "data" and print the counts. What would be the best way to do this for JSON in this format?

  • Rather than be concerned with the "best" way, have you found any way to make this work? What are you unhappy about with your current approach? Commented Aug 16, 2018 at 13:51
  • Not exactly a duplicate, but this question might help you out: stackoverflow.com/questions/29321677/… Commented Aug 16, 2018 at 13:57
  • @roganjosh I haven't found an approach that works yet. The default JSON parser updates the dictionary as it goes, so only the last entry for a duplicated key is stored. I need a way to catch duplicates while parsing, or a way to store each individual object within data during the json.loads call. Commented Aug 16, 2018 at 13:58
  • @SimonBrahan I think that the Counter solution found here is on the right track, but instead of rejecting the keys, I need to find a way to keep track of each of them. Commented Aug 16, 2018 at 14:06
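
To illustrate the behaviour described in the comments: with no hook, json.loads builds a plain dict, so a repeated key keeps only its last value. A minimal sketch, using a made-up sample string rather than the asker's real file:

```python
import json

# Hypothetical sample: "key1" appears twice inside "data".
raw = '{"data": {"key1": "first", "key2": "x", "key1": "last"}}'

# Default parsing: each object becomes a dict, so the second "key1"
# silently replaces the first one.
parsed = json.loads(raw)

print(parsed["data"]["key1"])  # last
print(len(parsed["data"]))     # 2
```

The duplicate is already gone by the time json.loads returns, which is why any counting has to happen during parsing.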

1 Answer

Based on the answer to the question "json.loads allows duplicate keys in a dictionary, overwriting the first value", this should work:

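import json

# Sample document in which "key1" appears twice inside "data".
testjson = '{"data": {"key1": "val", "key2": "val", "key1": "val"}}'


def parse_multimap(ordered_pairs):
    """Collect every value for a key in a list instead of overwriting it."""
    multimap = dict()
    for k, v in ordered_pairs:
        if k in multimap:
            multimap[k].append(v)
        else:
            multimap[k] = [v]
    return multimap


# The hook runs for every JSON object, so each value ends up wrapped in a list.
parsed = json.loads(testjson, object_pairs_hook=parse_multimap)

# "data" occurs once at the top level, hence [0] to reach the inner multimap.
data = parsed['data'][0]
for key in data:
    print("Key: {} | Count: {}".format(key, len(data[key])))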

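Output (on Python 3.7+, where dicts preserve insertion order; older versions may print the keys in a different order):

Key: key1 | Count: 2
Key: key2 | Count: 1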
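
For input shaped like the one in the question, a variant of the same idea is to have the hook return the raw pair lists and then count the keys under "data" with collections.Counter. This is only a sketch: the sample values and the helper name keep_pairs are made up for illustration, and it assumes "data" appears exactly once at the top level.

```python
import json
from collections import Counter

# Hypothetical sample in the question's shape; the values are invented.
raw = '''
{"data": {
    "nameA": {"result": 1, "timestamp": "t1"},
    "nameB": {"result": 2, "timestamp": "t2"},
    "nameA": {"result": 3, "timestamp": "t3"},
    "nameC": {"result": 4, "timestamp": "t4"}
}}
'''


def keep_pairs(ordered_pairs):
    # Return the (key, value) pairs untouched so duplicates survive parsing.
    return list(ordered_pairs)


# Every JSON object is now a list of pairs instead of a dict.
parsed = json.loads(raw, object_pairs_hook=keep_pairs)

# The top level has a single "data" pair, so converting it to a dict is safe.
data_pairs = dict(parsed)["data"]

# Count how often each key occurs directly under "data".
counts = Counter(key for key, _ in data_pairs)

for key, count in counts.items():
    print("Key: {} | Count: {}".format(key, count))
```

Because the hook applies to every object, the nested result/timestamp objects also come back as pair lists; convert them with dict(...) if you need to read their values.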

2 Comments

Because your testjson is in a different format than mine, I just get Counter({'data': 1}). How can I get it to look inside "data" before performing this operation?
Perfect. Thank you!
