Dataframe to Nested Dictionaries in Python

Question

I'm having a bit of trouble here. I need to take a dataframe

import pandas as pd

region = ['A','A','A','B','B','B']
sub_region = ['1','2','2','3','3','4']
state = ['a','b','c','d','e','f']

pd.DataFrame({"region":region,"sub_region":sub_region,"state":state})

and convert it into a nested dictionary with the following format:

[{name: "thing", children: [{name:"sub_thing",children:[{...}] }]}]

So the result is a list of nested dictionaries whose keys are always name and children, except that leaf nodes (children with no children of their own) omit the children key. The final desired output would be:

[{"name":"A",
    "children":[{"name":"1","children":[{"name":"a"}]},
                {"name":"2","children":[{"name":"b"},{"name":"c"}]}]
 },
 {"name":"B",
    "children":[{"name":"3","children":[{"name":"d"},{"name":"e"}]},
                {"name":"4","children":[{"name":"f"}]}]
 }
]

Assume a generalized framework where the number of levels can vary.

Ben Grossmann · Accepted Answer · 2022-10-13 20:12:46Z

I don't think you can do better than looping through the rows of the dataframe; that is, I don't see a way to vectorize this process. Also, if the number of levels can vary within the same dataframe, then the update function should be modified to handle NaN entries (e.g. by changing if len(row) > 1 to if len(row) > 1 and not pd.isna(row[1]); pd.isna is used here because np.isnan raises a TypeError on string values).

That said, I believe that the following script should be satisfactory.

import pandas as pd

region = ['A','A','A','B','B','B']
sub_region = ['1','2','2','3','3','4']
state = ['a','b','c','d','e','f']

df = pd.DataFrame({"region":region,"sub_region":sub_region,"state":state})
ls = []

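def update(row,ls):
    # Find the existing node for this level, or create it if it is missing.
    for d in ls:
        if d['name'] == row[0]:
            break
    else:
        ls.append({'name':row[0]})
        d = ls[-1]
    # Recurse into the remaining levels, adding 'children' only when needed.
    if len(row) > 1:
        if 'children' not in d:
            d['children'] = []
        update(row[1:],d['children'])

# Plain tuples keep row[0] and row[1:] positional; integer keys on a
# label-indexed Series (as yielded by iterrows) are deprecated in newer pandas.
for r in df.itertuples(index=False, name=None):
    update(r,ls)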

print(ls)

The resulting list ls:

[{'name': 'A',
  'children': [{'name': '1', 'children': [{'name': 'a'}]},
   {'name': '2', 'children': [{'name': 'b'}, {'name': 'c'}]}]},
 {'name': 'B',
  'children': [{'name': '3', 'children': [{'name': 'd'}, {'name': 'e'}]},
   {'name': '4', 'children': [{'name': 'f'}]}]}]
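The note about NaN entries could be sketched roughly as follows. This is only an illustration under assumptions: the ragged dataframe df_ragged and the function name update_ragged are hypothetical, and it assumes that shorter branches are padded with None/NaN in the trailing columns only.

```python
import pandas as pd

# Hypothetical ragged data: the "B" branch stops at the sub_region level.
df_ragged = pd.DataFrame({
    "region": ["A", "A", "B"],
    "sub_region": ["1", "2", "3"],
    "state": ["a", "b", None],
})

def update_ragged(row, nodes):
    # Stop at an exhausted row or at the first missing value.
    if len(row) == 0 or pd.isna(row[0]):
        return
    # Find the existing node for this level, or create it.
    for node in nodes:
        if node["name"] == row[0]:
            break
    else:
        node = {"name": row[0]}
        nodes.append(node)
    # Only create "children" when the next level actually holds a value.
    if len(row) > 1 and not pd.isna(row[1]):
        update_ragged(row[1:], node.setdefault("children", []))

tree = []
for row in df_ragged.itertuples(index=False, name=None):
    update_ragged(row, tree)

print(tree)
```

With this input, the node for sub-region '3' ends up with no children key at all, which matches the "childless children" convention requested in the question.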

Here's a version where childless children have 'children':[] in their dict, which I find a bit more natural.

import pandas as pd

region = ['A','A','A','B','B','B']
sub_region = ['1','2','2','3','3','4']
state = ['a','b','c','d','e','f']

df = pd.DataFrame({"region":region,"sub_region":sub_region,"state":state})
ls = []

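def update(row,ls):
    # An empty row means every level has been placed.
    if len(row) == 0:
        return
    # Find the existing node for this level, or create it with empty children.
    for d in ls:
        if d['name'] == row[0]:
            break
    else:
        ls.append({'name':row[0], 'children':[]})
        d = ls[-1]
    update(row[1:],d['children'])

# Plain tuples keep row[0] and row[1:] positional; integer keys on a
# label-indexed Series (as yielded by iterrows) are deprecated in newer pandas.
for r in df.itertuples(index=False, name=None):
    update(r,ls)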

print(ls)

The resulting list ls:

[{'name': 'A',
  'children': [{'name': '1', 'children': [{'name': 'a', 'children': []}]},
   {'name': '2',
    'children': [{'name': 'b', 'children': []},
     {'name': 'c', 'children': []}]}]},
 {'name': 'B',
  'children': [{'name': '3',
    'children': [{'name': 'd', 'children': []},
     {'name': 'e', 'children': []}]},
   {'name': '4', 'children': [{'name': 'f', 'children': []}]}]}]
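For comparison, the same structure can also be built by grouping instead of scanning the list for each row. The following is a minimal sketch, not a vectorized solution: it still loops in Python, just once per group rather than once per row. The function name to_tree is hypothetical, and the sketch assumes every row has a value at every level.

```python
import pandas as pd

def to_tree(frame, levels):
    """Build the name/children list by grouping on one column per level."""
    first, rest = levels[0], levels[1:]
    nodes = []
    # sort=False keeps groups in order of first appearance, like the row loop.
    for name, group in frame.groupby(first, sort=False):
        node = {"name": name}
        if rest:
            # Deeper levels are built only from the rows in this group.
            node["children"] = to_tree(group, rest)
        nodes.append(node)
    return nodes

df = pd.DataFrame({
    "region": ["A", "A", "A", "B", "B", "B"],
    "sub_region": ["1", "2", "2", "3", "3", "4"],
    "state": ["a", "b", "c", "d", "e", "f"],
})

tree = to_tree(df, ["region", "sub_region", "state"])
print(tree)
```

This produces the variant in which leaf nodes have no children key. Note that groupby drops missing keys by default, so ragged data with NaN entries would need extra handling here as well.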

edited Oct 13, 2022 at 20:12

answered Oct 13, 2022 at 15:48

2 Comments

Zach Kramer Over a year ago

bless you! worked like a charm. thankfully I don't need to parse this on the fly, so vectorization is not a priority at all.

Ben Grossmann Over a year ago

@Zach Glad to hear it! If that's everything you were looking for, I'd appreciate it if you would "accept" the answer by clicking the check mark (✓) underneath the vote arrows on my answer.
