I have a JSON like this:

{
    "department":"Data & Analytics",
    "child":[
        {
            "department":"Data Enginnering",
            "child": [
                {"department":"AWS Squad"},
                {"department":"GCP Squad"}
                ..
                    ..so..
                        ..on..
                            ..so..
                                ..forth..
                                    ..
            ]
        },
        {
            "department":"Data Science"
        }
    ]
}

I need to load it into BigQuery, so I first want to transform it into something like this:

[
    {
        "department":"Data & Analytics",
        "child":["Data Enginnering", "Data Science"]
    },
    {
        "department":"Data Enginnering",
        "child":["AWS Squad", "GCP Squad"]
    },
    {
        "department":"Data Science"
    },
    {
        "department": "AWS Squad"
    },
    {
        "department": "GCP Squad"
    }
]

But I got stuck trying.

  • Is the original JSON basically a recursive data structure, i.e. the children have the same structure as the parent, or does it end at two levels deep? Why does "Data Engineering" get an object added to the final list, but not "Other" or "Sales"? Commented Jun 30, 2022 at 2:27
  • It is recursive, with no established maximum depth. Don't mind the labels; they were just illustrative. I will change them to make the example easier to follow. Commented Jun 30, 2022 at 2:31

2 Answers

For a non-recursive approach, you can use a standard breadth-first traversal: keep a queue of nodes to visit, pop one at a time, and push its children onto the queue.

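from collections import deque

def flatten(data):
    # Queue of nodes still to visit, starting with the root.
    q = deque([data])

    while q:
        current = q.popleft()
        d = {"department": current['department']}

        for child in current.get('child', []):
            # Record the child's name on the parent and queue the child itself.
            d.setdefault('child', []).append(child['department'])
            q.append(child)

        yield d


# `data` is the nested dict from the question.
list(flatten(data))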

Which will give you:

[{'department': 'Data & Analytics',
  'child': ['Data Enginnering', 'Data Science']},
 {'department': 'Data Enginnering', 'child': ['AWS Squad', 'GCP Squad']},
 {'department': 'Data Science'},
 {'department': 'AWS Squad'},
 {'department': 'GCP Squad'}]

The order differs slightly from the recursive approach, which is depth-first: here all departments at one level are emitted before any department at the next level.
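
To make the ordering difference concrete, here is a minimal sketch that walks the sample tree from the question both ways. The helper `visit_order` and its `depth_first` flag are hypothetical names introduced only for this illustration; it collects department names rather than building the flattened records.

```python
from collections import deque

# Sample tree from the question (spelling kept as in the original data).
data = {
    "department": "Data & Analytics",
    "child": [
        {
            "department": "Data Enginnering",
            "child": [
                {"department": "AWS Squad"},
                {"department": "GCP Squad"},
            ],
        },
        {"department": "Data Science"},
    ],
}


def visit_order(tree, depth_first=False):
    """Return department names in visiting order (hypothetical helper)."""
    pending = deque([tree])
    order = []
    while pending:
        # Popping from the right treats `pending` as a stack (depth-first);
        # popping from the left treats it as a queue (breadth-first).
        node = pending.pop() if depth_first else pending.popleft()
        order.append(node["department"])
        children = node.get("child", [])
        # For depth-first, push children reversed so the first child is visited first.
        pending.extend(reversed(children) if depth_first else children)
    return order


bfs_order = visit_order(data)
dfs_order = visit_order(data, depth_first=True)
print(bfs_order)
print(dfs_order)
```

In the breadth-first result "Data Science" comes before the two squads; in the depth-first result it comes after them.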

Since the data is recursive, this can be solved using recursion.

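def convert(data, output):
    """Append a flat record for `data` and each descendant to `output`; return the department name."""
    department = data["department"]
    children = data.get("child")

    new_object = {"department": department}
    # Append the parent before recursing so it precedes its descendants.
    output.append(new_object)

    if children:
        # Each recursive call returns the child's name, which becomes the parent's "child" entry.
        new_object["child"] = [convert(child, output) for child in children]

    return department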

It would be used like this:

test_data = {
    "department":"Data & Analytics",
    "child":[
        {
            "department":"Data Enginnering",
            "child": [
                {"department":"Other"},
                {"department":"Sales"}
            ]
        },
        {
            "department":"Data Science"
        }
    ]
}

output = []
convert(test_data, output)
# convert output to json and send to BigQuery...

For the above example, the result is:

[
    {
        "department": "Data & Analytics",
        "child": [
            "Data Enginnering",
            "Data Science"
        ]
    },
    {
        "department": "Data Enginnering",
        "child": [
            "Other",
            "Sales"
        ]
    },
    {
        "department": "Other"
    },
    {
        "department": "Sales"
    },
    {
        "department": "Data Science"
    }
]

It is not quite the same as your example output, but it's unclear from that example why some departments get an object added to the main list, and others don't.
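
For the "convert output to json" step, one possible sketch is below. It assumes the data will be loaded from a file, in which case BigQuery expects newline-delimited JSON (one object per line) rather than a single JSON array. The helper name `to_ndjson` and the two sample rows are made up for illustration; the actual load step is left out.

```python
import json


def to_ndjson(rows):
    """Serialize a list of dicts as newline-delimited JSON, one object per line."""
    # ensure_ascii=False keeps any non-ASCII characters readable instead of \u escapes.
    return "\n".join(json.dumps(row, ensure_ascii=False) for row in rows)


# Two sample rows in the flattened shape produced above (assumed for illustration).
rows = [
    {"department": "Data & Analytics", "child": ["Data Enginnering", "Data Science"]},
    {"department": "Data Science"},
]

ndjson = to_ndjson(rows)
print(ndjson)
```

Each line is a complete JSON object, so the text can be written to a file as-is. Rows without a "child" key simply omit that field.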
