How to parse a JSON object into smaller objects using Python?

Question

I have a very large JSON object that I need to split into smaller objects and write those smaller objects to file.

Sample Data

raw = '[{"id":"1","num":"2182","count":-17}{"id":"111","num":"3182","count":-202}{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'

Desired Output (In this example, split the data in half)

output_file1.json = [{"id":"1","num":"2182","count":-17},{"id":"111","num":"3182","count":-202}]

output_file2.json = [{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]

Current Code

import pandas as pd
import itertools
import json
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return zip_longest(fillvalue=fillvalue, *args)

raw = '[{"id":"1","num":"2182","count":-17}{"id":"111","num":"3182","count":-202}{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'

#split the data into manageable chunks + write to files

for i, group in enumerate(grouper(raw, 4)):
    with open('outputbatch_{}.json'.format(i), 'w') as outputfile:
        json.dump(list(group), outputfile)

Current Output of first file "outputbatch_0.json"

["[", "{", "\"", "s"]

I feel like I'm making this much harder than it needs to be.

Your raw string isn't valid JSON (missing commas between objects). Is this the case with your real data or just a typo in the question?
– sjw, Commented Sep 18, 2018 at 14:14

stakka · Accepted Answer · 2018-09-18 14:32:54Z

2

Assuming raw should be a valid JSON string (I added the missing commas), here is a simple, working solution.

import json

raw = '[{"id":"1","num":"2182","count":-17},{"id":"111","num":"3182","count":-202},{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'
json_data = json.loads(raw)

def split_in_files(json_data, amount):
    step = len(json_data) // amount
    pos = 0
    for i in range(amount - 1):
        with open('output_file{}.json'.format(i+1), 'w') as file:
            json.dump(json_data[pos:pos+step], file)
            pos += step
    # last one
    with open('output_file{}.json'.format(amount), 'w') as file:
        json.dump(json_data[pos:], file)

split_in_files(json_data, 2)
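
The question's grouper(raw, 4) call produced single characters because it iterated over the string itself rather than over the parsed list. If you would rather fix the number of objects per file than the number of files, something like the following might work. It is a minimal sketch that assumes the data has already been parsed with json.loads; the names chunked, write_chunks, chunk_size and prefix are illustrative and do not come from the original post.

```python
import json


def chunked(items, chunk_size):
    """Yield consecutive slices of items, each with at most chunk_size elements."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def write_chunks(items, chunk_size, prefix='outputbatch_'):
    """Write each chunk to its own JSON file and return the file names (hypothetical helper)."""
    names = []
    for i, chunk in enumerate(chunked(items, chunk_size)):
        name = '{}{}.json'.format(prefix, i)
        with open(name, 'w') as outputfile:
            json.dump(chunk, outputfile)
        names.append(name)
    return names


raw = '[{"id":"1","num":"2182","count":-17},{"id":"111","num":"3182","count":-202},{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'
json_data = json.loads(raw)  # parse first, so we iterate over objects, not characters
```

Calling write_chunks(json_data, 2) on the sample data should give outputbatch_0.json and outputbatch_1.json with two objects each; the last file simply holds fewer objects when the length is not a multiple of chunk_size.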


tkircsi · Accepted Answer · 2018-09-18 14:35:38Z

0

This assumes raw is valid JSON. The saving part is only sketched.

import json

raw = '[{"id":"1","num":"2182","count":-17},{"id":"111","num":"3182","count":-202},{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'

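# Parse with json.loads instead of eval, which would execute arbitrary code
raw_list = json.loads(raw)
# Pair consecutive items: (0, 1), (2, 3), ...; an unpaired trailing item is dropped
raw_zipped = list(zip(raw_list[0::2], raw_list[1::2]))

for i, item in enumerate(raw_zipped, start=1):
    # One file per pair, so later pairs do not overwrite earlier ones
    with open('output_file{}.json'.format(i), 'w') as f:
        json.dump(item, f)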
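
If the real data really is missing commas between objects, as in the question's sample, json.loads will reject it. One possible workaround is to decode the objects one at a time with json.JSONDecoder.raw_decode, which returns the parsed value together with the index where it ended. This is only a sketch: it assumes the input is a single bracketed sequence of JSON values, and parse_concatenated is a hypothetical helper name.

```python
import json


def parse_concatenated(text):
    """Parse a bracketed sequence of JSON values whose separating commas may be missing."""
    decoder = json.JSONDecoder()
    body = text.strip()
    # Drop the outer brackets so each element can be decoded on its own.
    if body.startswith('[') and body.endswith(']'):
        body = body[1:-1]
    items = []
    pos = 0
    while pos < len(body):
        # Skip whitespace and any commas between elements.
        if body[pos] in ' \t\r\n,':
            pos += 1
            continue
        # raw_decode returns (value, index just past the value).
        obj, pos = decoder.raw_decode(body, pos)
        items.append(obj)
    return items


raw = '[{"id":"1","num":"2182","count":-17}{"id":"111","num":"3182","count":-202}{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'
json_data = parse_concatenated(raw)
print(len(json_data))  # 4
```

The resulting json_data is an ordinary list, so it can be split and written out like the valid input above.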


Abdul Majeed · Accepted Answer · 2018-09-18 14:50:06Z

0

If you need exactly half of the data, you can use slicing:

import json

raw = '[{"id":"1","num":"2182","count":-17},{"id":"111","num":"3182","count":-202},{"id":"222","num":"4182","count":12},{"id":"33333","num":"5182","count":12}]'
json_data = json.loads(raw)

# Integer division, so the result can be used as a slice index in Python 3
size_of_half = len(json_data) // 2

print(json_data[:size_of_half])
print(json_data[size_of_half:])

Edge cases, such as an odd length, are not handled in this code. In short, json_data is a plain list, so you can do anything with it that you can do with a list.
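
One way to handle an odd length, or a split into more than two parts, is to let the first few slices take one extra element. A minimal sketch; split_evenly and its parameters are hypothetical names, not part of the original answer.

```python
def split_evenly(items, parts):
    """Split items into `parts` consecutive slices whose lengths differ by at most one."""
    base, extra = divmod(len(items), parts)
    result = []
    pos = 0
    for i in range(parts):
        # The first `extra` slices take one additional element.
        size = base + (1 if i < extra else 0)
        result.append(items[pos:pos + size])
        pos += size
    return result


print(split_evenly([1, 2, 3, 4, 5], 2))  # [[1, 2, 3], [4, 5]]
```

Each returned slice can then be passed to json.dump in the same way as the halves above.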
