
I have some rather large JSON files. Each contains thousands of objects within one (1) array. The JSONs are structured in the following format:

{
    "alert": [
        {
            "field1": "abc",
            "field2": "def",
            "field3": "xyz"
        },
        {
            "field1": null,
            "field2": null,
            "field3": "xyz"
        },
        ...
    ]
}

What's the most efficient way to use Python and the json library to search through a JSON file, find the unique values in each object within the array, and count how many times they appear? E.g., search the "field3" key of each object in the array for the value "xyz" and count how many times it appears. I tried a few variations based on existing solutions on StackOverflow, but they are not producing the results I'm looking for.
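
For illustration, here is a naive non-streaming sketch of the single-field count I'm after (it loads the whole file at once, which I'd like to avoid for files this large):

import json

# Naive approach: load the entire file into memory, then count one
# value under one key. "data.json" is a placeholder file name.
with open("data.json") as f:
    data = json.load(f)

count = sum(1 for obj in data["alert"] if obj.get("field3") == "xyz")
print(count)  # 2 for the sample above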

1 Answer

A quick search on PyPI turned up ijson, a streaming JSON parser.

Here's an example which should work with your data:

import json
import ijson

# Stream the objects in the "alert" array one at a time, so the whole
# file never has to be loaded into memory.
counts = {}
with open("data.json", "rb") as f:
    for o in ijson.items(f, "alert.item"):
        for k, v in o.items():
            field = counts.setdefault(k, {})
            field[v] = field.get(v, 0) + 1

print(json.dumps(counts, indent=2))

Running this with your sample data in data.json produces:

{
  "field1": {
    "abc": 1,
    "null": 1
  },
  "field2": {
    "def": 1,
    "null": 1
  },
  "field3": {
    "xyz": 2
  }
}

Note, however, that the null values in your input were transformed into the string "null": JSON object keys must be strings, so json.dumps serializes a None key that way.
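
Not part of the original code, but the nested-dict bookkeeping can also be written with collections.defaultdict and collections.Counter; a minimal sketch:

from collections import Counter, defaultdict

import ijson

# Same streaming count, keeping one Counter per field name.
counts = defaultdict(Counter)
with open("data.json", "rb") as f:
    for o in ijson.items(f, "alert.item"):
        for k, v in o.items():
            counts[k][v] += 1

# Individual counts can then be read directly, e.g. for the sample data:
# counts["field3"]["xyz"] == 2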

As a point of comparison, here is a jq command which produces an equivalent result using tostream:

jq -M '
  reduce (tostream|select(length==2)) as [$p,$v] (
    {}
  ; ($p[2:]+[$v|tostring]) as $k
  | setpath($k; getpath($k)+1)
  )
' data.json
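
For reference, tostream turns the document into [path, value] pairs, plus shorter closing events that mark the end of each array and object; select(length==2) keeps only the leaf pairs. On the sample data those leaf events look like:

[["alert",0,"field1"],"abc"]
[["alert",0,"field2"],"def"]
[["alert",0,"field3"],"xyz"]
[["alert",1,"field1"],null]
[["alert",1,"field2"],null]
[["alert",1,"field3"],"xyz"]

$p[2:] then drops the leading "alert" and the array index, leaving just the field name, and tostring converts the value (including null) into a string key, matching the Python output above.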