I have the following dataframe, which contains parent-child relations:

import pandas as pd

data = pd.DataFrame({'Parent': ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k'],
                     'Child': ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']})
a
├── b
│   └── d
└── c
    ├── f
    │   └── h
    └── g
z
└── q
    └── k
        └── w

I would like to get a new dataframe which contains e.g. all children of parent a:

child  level1  level2  level x
d      a       b       -
b      a       -       -
c      a       -       -
f      a       c       -
h      a       c       f
g      a       c       -

I do not know upfront how many levels there are, which is why I have used 'level x'.

I guess I somehow need a recursive pattern to iterate over the dataframe.

    Based on the line of code you posted with the dictionary for the dataframe, how is one supposed to know that 'd' is a child of 'b'? I see it in your diagram, but how does the input data show that relationship? Ah, never mind, I see it now: the first parent is the parent of the first child, the second parent is the parent of the second child, etc. So 'd' is the third child, and is therefore the child of the third parent. Commented Jun 1, 2021 at 20:38
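The positional pairing described in this comment can be made explicit with a small sketch. The names `parents`, `children`, `pairs` and `parent_of` are illustrative only and do not appear in the question.

```python
# The two columns from the question, as plain lists.
parents = ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k']
children = ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']

# Row i of the dataframe pairs the i-th parent with the i-th child.
pairs = list(zip(parents, children))

# Inverting the pairing gives a child -> parent lookup
# (this assumes every child has exactly one parent, as in the example data).
parent_of = {child: parent for parent, child in pairs}

print(pairs[2])        # the third row: parent 'b', child 'd'
print(parent_of['d'])  # 'b'
```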

1 Answer

I'd suggest

  • building, for each child, the list of its parents (child: parent list)
  • building the DataFrame, giving each parent in that list a level name
import pandas as pd

values = {'Parent': ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k'],
          'Child': ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']}

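# Map each child to its direct parent.
relations = dict(zip(values['Child'], values['Parent']))

def get_parent_list(element):
    # Walk up the tree recursively; the result is ordered from the root
    # down to the direct parent. An element without a parent yields [].
    parent = relations.get(element)
    return get_parent_list(parent) + [parent] if parent else []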

all_relations = {
    children: {f'level_{idx}': value for idx, value in enumerate(get_parent_list(children))}
    for children in set(values['Child'])
}

df = pd.DataFrame.from_dict(all_relations, orient='index')
print(df)


  level_0 level_1 level_2
b       a     NaN     NaN
f       a       c     NaN
d       a       b     NaN
g       a       c     NaN
h       a       c       f
q       z     NaN     NaN
k       z       q     NaN
w       z       q       k
c       a     NaN     NaN
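The output above covers every child in the data, whereas the question asks only for the descendants of one parent such as `a`. A minimal sketch of one way to restrict the result follows; `descendants_of` is a hypothetical helper, not part of the answer, and it assumes each child has exactly one parent and that the data contains no cycles.

```python
import pandas as pd

values = {'Parent': ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k'],
          'Child': ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']}

relations = dict(zip(values['Child'], values['Parent']))

def get_parent_list(element):
    parent = relations.get(element)
    return get_parent_list(parent) + [parent] if parent else []

def descendants_of(root):
    # Hypothetical helper: keep only children that have `root` among
    # their ancestors, and number the levels starting from `root`.
    rows = {}
    for child in values['Child']:
        ancestors = get_parent_list(child)
        if root in ancestors:
            path = ancestors[ancestors.index(root):]
            rows[child] = {f'level_{idx}': value for idx, value in enumerate(path)}
    return pd.DataFrame.from_dict(rows, orient='index')

df_a = descendants_of('a')
print(df_a)
```

Because the path is cut at `root`, the same helper should also work for an inner node such as `c`, in which case `level_0` is `c` rather than `a`.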
2 Comments

I'm a little confused by this line: children: {f'level_{idx}': value for idx, value in enumerate(get_parent_list(children))}. I know it is a comprehension, but how is it possible to pass children to get_parent_list when it is being defined in the same line?
@juanjoseperezhernandez Being on the "same line" is the whole point of a dict/list comprehension (a nested dict comprehension here). children is defined by for children in set(values['Child']). If it were a classic loop, you'd have that line first and then the line using it.
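To illustrate the reply above, here is a sketch of the same nested dict comprehension written as classic loops. The variable `levels` is introduced only for this illustration; everything else is taken from the answer's code.

```python
values = {'Parent': ['a', 'a', 'b', 'c', 'c', 'f', 'q', 'z', 'k'],
          'Child': ['b', 'c', 'd', 'f', 'g', 'h', 'k', 'q', 'w']}

relations = dict(zip(values['Child'], values['Parent']))

def get_parent_list(element):
    parent = relations.get(element)
    return get_parent_list(parent) + [parent] if parent else []

all_relations = {}
for children in set(values['Child']):   # 'children' is bound here first...
    levels = {}
    # ...and only then used, exactly as in the comprehension.
    for idx, value in enumerate(get_parent_list(children)):
        levels[f'level_{idx}'] = value
    all_relations[children] = levels

print(all_relations['h'])
```

In a comprehension the `for` clause is written after the expression, but it is evaluated first, so `children` already has a value by the time `get_parent_list(children)` runs.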
