I have two JSON files, shown below. I want to find the differences between them and write those differences to a third JSON file. The diff should be calculated as B.json - A.json.

A.json

[{
        "Number": 123,
        "brand": "Ford",
        "model": {
            "Mustang1": "2.64",
            "Mustang2": "3.00",
            "Mustang3": "1.00",
            "Mustang4": "1.64"
        }
    },
    {
        "Number": 321,
        "brand": "Toyota",
        "model": {
            "Camry": "2.64",
            "Prius": "3.00",
            "Corolla": "1.00",
            "Tundra": "1.64"
        }
    },
    {
        "Number": 111,
        "brand": "Honda",
        "model": {
            "Accord": "2.64",
            "Civic": "3.00",
            "Insight": "1.00",
            "Pilot": "1.64"
        }
    },
    {
        "Number": 891,
        "brand": "Ford",
        "model": {
            "Mustang1": "2.64",
            "Mustang8": "3.00",
            "Mustang3": "1.00",
            "Mustang6": "1.64"
        }
    },
    {
        "Number": 745,
        "brand": "Toyota",
        "model": {
            "Camry": "2.64",
            "Sienna": "3.00",
            "4Runner": "1.00",
            "Prius": "1.64"
        }
    },
    {
        "Number": 325,
        "brand": "Honda",
        "model": {
            "Accord": "2.64",
            "Passport": "3.00",
            "HR-V": "1.00",
            "Pilot": "1.64"
        }
    },
    {
        "Number": 745,
        "brand": "Accura",
        "model": {
            "TLX": "2.64",
            "MDX": "3.00"
        }
    },
    {
        "Number": 325,
        "brand": "Accura",
        "model": {
            "TLX": "2.64",
            "MDX": "3.00"
        }
    }
]

B.json

[{
        "Number": 123,
        "brand": "Ford",
        "model": {
            "Mustang1": "2.64",
            "Mustang2": "3.00",
            "Mustang5": "1.64"
        }
    },
    {
        "Number": 321,
        "brand": "Toyota",
        "model": {
            "Camry": "2.64",
            "Prius1": "3.00",
            "Corolla1": "1.00",
            "Tundra": "1.64"
        }
    },
    {
        "Number": 111,
        "brand": "Honda",
        "model": {
            "Accord1": "2.64",
            "Civic1": "3.00",
            "Insight": "1.00",
            "Pilot": "1.64"
        }
    },
    {
        "Number": 891,
        "brand": "Ford",
        "model": {
            "Mustang1": "2.64",
            "Mustang8": "3.00",
            "Mustang3": "1.00",
            "Mustang6": "1.64"
        }
    },
    {
        "Number": 745,
        "brand": "Toyota",
        "model": {
            "Camry2": "2.64",
            "Sienna2": "3.00",
            "4Runner": "1.00",
            "Prius": "1.64"
        }
    },
    {
        "Number": 325,
        "brand": "Honda",
        "model": {
            "Accord": "2.64",
            "Passport2": "3.00",
            "HR-V2": "1.00",
            "Pilot": "1.64"
        }
    },
    {
        "Number": 745,
        "brand": "Accura",
        "model": {
            "TLX": "2.64",
            "MDX2": "3.00"
        }
    },
    {
        "Number": 325,
        "brand": "Accura",
        "model": {
            "TLX1": "2.64",
            "MDX": "3.00"
        }
    }
]

This prints:

{0: {'model': {'$delete': ['Mustang3', 'Mustang4'],
               'Mustang2': '1.00',
               'Mustang5': '1.64'}},
 1: {'model': {'$delete': ['Prius', 'Corolla'],
               'Corolla1': '1.00',
               'Prius1': '3.00'}},
 2: {'model': {'$delete': ['Accord', 'Civic'],
               'Accord1': '2.64',
               'Civic1': '3.00'}},
 4: {'model': {'$delete': ['Camry', 'Sienna'],
               'Camry2': '2.64',
               'Sienna2': '3.00'}},
 5: {'model': {'$delete': ['Passport', 'HR-V'],
               'HR-V2': '1.00',
               'Passport2': '3.00'}},
 6: {'model': {'$delete': ['MDX'], 'MDX2': '3.00'}},
 7: {'model': {'$delete': ['TLX'], 'TLX1': '2.64'}}}

Expected result, calculated as B.json - A.json: for each record, the keys that are present in model in B.json but not in A.json, grouped by the other keys (Number, brand).

{"Number": 123, "brand": "Ford", 'model': {'Mustang2': '1.00', 'Mustang5': '1.64'}},
{"Number": 321, "brand": "Toyota", 'model': {'Corolla1': '1.00', 'Prius1': '3.00'}},
{"Number": 111, "brand": "Honda", 'model': {'Accord1': '2.64', 'Civic1': '3.00'}},
{"Number": 745, "brand": "Toyota", 'model': {'Camry2': '2.64', 'Sienna2': '3.00'}},
{"Number": 325, "brand": "Honda", 'model': {'HR-V2': '1.00', 'Passport2': '3.00'}},
{"Number": 745, "brand": "Accura", 'model': {'MDX2': '3.00'}},
{"Number": 325, "brand": "Accura", 'model': {'TLX1': '2.64'}}}
  • Yes, A.json and B.json - are valid json files. Commented May 6, 2019 at 13:55
  • You can use https://jsonlint.com/ to check this. Commented May 6, 2019 at 13:58
  • @interjay Obviously these are "json-per-line" files. Commented May 6, 2019 at 13:59
  • Both a_li and b_li contain the same json object! Perhaps you need to edit your question @user15051990 Commented May 6, 2019 at 13:59
  • yes looks like Mustang 2 was meant to be Mustang 3 in B.json Commented May 6, 2019 at 14:00

2 Answers

jsondiff probably doesn't do what you are trying to do.
If the A and B lists do not necessarily contain the same Number and brand pairs, you can match the records yourself:

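# A and B are lists of dicts, e.g. A = json.load(open(path1))
res = []
for b in B:
    r = dict(b)  # shallow copy, so the record in B is left unchanged
    # first record in A with the same Number and brand, or None
    b_in_A = next((a for a in A if b["Number"] == a["Number"] and b["brand"] == a["brand"]), None)
    if b_in_A:
        # keep only the model keys that the matching A record lacks
        r["model"] = {k: v for k, v in r["model"].items() if k not in b_in_A["model"]}
    res.append(r)
res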

Output:

[{'Number': 123, 'brand': 'Ford', 'model': {'Mustang5': '1.64'}},
 {'Number': 321,
  'brand': 'Toyota',
  'model': {'Corolla1': '1.00', 'Prius1': '3.00'}},
 {'Number': 111,
  'brand': 'Honda',
  'model': {'Accord1': '2.64', 'Civic1': '3.00'}},
 {'Number': 891, 'brand': 'Ford', 'model': {}},
 {'Number': 745,
  'brand': 'Toyota',
  'model': {'Camry2': '2.64', 'Sienna2': '3.00'}},
 {'Number': 325,
  'brand': 'Honda',
  'model': {'HR-V2': '1.00', 'Passport2': '3.00'}},
 {'Number': 745, 'brand': 'Accura', 'model': {'MDX2': '3.00'}},
 {'Number': 325, 'brand': 'Accura', 'model': {'TLX1': '2.64'}}]
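
The question also asks for the result to be written to a third file. A minimal sketch of that, assuming both inputs are JSON arrays like the ones in the question; `model_diff`, `write_diff` and the output file name are hypothetical names chosen for this example, not part of the answer above. It also swaps the linear scan for a dictionary keyed on (Number, brand) and skips records whose model has nothing new, which is closer to the expected result in the question.

```python
import json


def model_diff(a_records, b_records):
    """Return B's records reduced to the model keys missing from the matching A record."""
    # Index A by (Number, brand); setdefault keeps the first record for a
    # duplicated key, mirroring the next(...) lookup in the answer.
    a_index = {}
    for a in a_records:
        a_index.setdefault((a["Number"], a["brand"]), a["model"])

    result = []
    for b in b_records:
        a_model = a_index.get((b["Number"], b["brand"]), {})
        new_keys = {k: v for k, v in b["model"].items() if k not in a_model}
        if new_keys:  # assumption: records with nothing new are left out
            result.append({"Number": b["Number"], "brand": b["brand"], "model": new_keys})
    return result


def write_diff(path_a, path_b, path_out):
    """Load two JSON arrays, diff them, and write the result as JSON."""
    with open(path_a) as fa, open(path_b) as fb:
        a_records, b_records = json.load(fa), json.load(fb)
    with open(path_out, "w") as fo:
        json.dump(model_diff(a_records, b_records), fo, indent=4)
```

Calling `write_diff("A.json", "B.json", "C.json")` would then produce the third file. Drop the `if new_keys:` check if records with an empty model should be kept, as in the output above.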

8 Comments

I am getting an error at `r = dict(b)`: ValueError: dictionary update sequence element #0 has length 1; 2 is required. This happens when I run your code with `A = open(path1)` and `B = open(path2)` in front of it.
I can only guess. In my code A and B are lists of dictionaries, while your A and B are TextIOWrapper objects (so dict is being given a line of text in your code). Import the json module and use A = json.load(open(path1)) instead.
I used A = json.load(open(path1)), and now I am getting json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 119)
This probably means your file is not well-formed JSON. Post an example of your file.
Updated, Checkout B.json.
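
The `Extra data` error in the comments above is what `json.load` raises when a file contains more than one JSON document, for example one object per line. A hedged sketch of a loader that accepts both layouts; `load_records` is a hypothetical helper name, and the one-object-per-line layout is an assumption about the original files.

```python
import json


def load_records(path):
    """Load a JSON array, falling back to one JSON object per line."""
    with open(path) as f:
        text = f.read()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fallback (assumed layout): every non-empty line is its own JSON document.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # Wrap a single top-level object so callers always receive a list.
    return data if isinstance(data, list) else [data]
```

With this, `A = load_records(path1)` and `B = load_records(path2)` both yield lists of dictionaries, which is what the loop in the answer expects.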

jsondiff appears to compare two individual JSON objects to find differences in their keys. You appear to be trying to get a list of the JSON objects that are not in both lists. This can be done by looping through the lists and checking whether each object is in both:

unique_cars = [car for car in jf if car not in jg]
for car in jg:
    if car not in jf:
        unique_cars.append(car)

There's likely a standard-library tool that handles this sort of thing, but that's the logic you want.

1 Comment

I am trying to get the keys present in model in B.json but not in A.json, grouped by Number and brand.
