1

So here's my problem:

I have successfully parsed a text file, together with each line's indentation level, into a list like:

A = [[1,'a'],[1,'b'],[2,'c'],[2,'d'],[1,'e'],[2,'f']]

Each element in list A is a list of length 2. Each element corresponds to a line read from the text file. A[x][0] is the indent level of the line in the text file, A[x][1] is the content of the line where x is the index of any element in A.

For example, A[1] = [1,'b'], where 1 is the indent level and 'b' is the line text. A[2] and A[3] are children of A[1], i.e. lines indented beneath it.

I am trying to get an output list which will be in the following format:

B = [['a'],['b',['c','d']],['e',['f']]]

This way, when I iterate over B[x][0], I will get only the first-level items and be able to recurse into each element.

The algorithm should be able to handle arbitrary depth, i.e. if A[3] were followed by an element [3,'z'], that element should be nested under A[3].

I have explored some other posts that solve a similar problem using itertools.groupby, but unfortunately I haven't understood them well enough to apply them to my problem.

Really appreciate your help folks!

3
  • 2
    Why is 'b' in a sublist, although it's on the same level as 'a'? Why isn't 'd' in its own sublist then? Could you clarify the rules? Commented Dec 29, 2012 at 11:05
  • The layout is: -a -b --c --d -e --f. So c and d belong to b, and f belongs to e. The root list will only contain elements from level 1, i.e. a, b and e, but since b and e have children, they need to be lists themselves: if b is added as a plain element of the root list, the relationship with its children will be lost. Commented Dec 29, 2012 at 11:16
  • 3
    It won't be lost: in ['a', 'b', ['c', 'd'], 'e', ['f']] each sublist is "relevant" to the previous element. Commented Dec 29, 2012 at 11:21

2 Answers

0

Try this simple stack-based algorithm:

A = [[1,'a'],[1,'b'],[2,'c'],[2,'d'],[1,'e'],[2,'f']]
stack = [ [] ]
for level, item in A:
    while len(stack) > level:
        stack.pop()
    while len(stack) <= level:
        node = (item, [])
        stack[-1].append(node)
        stack.append(node[1])

result = stack[0]

This creates a structure like:

[('a', []), ('b', [('c', []), ('d', [])]), ('e', [('f', [])])]

which, IMO, is more convenient to work with, but it should be no problem to convert it to yours if needed:

def convert(lst):
    return [ [x, convert(y)] if y else x for x, y in lst]

result = convert(stack[0])
print(result)
# ['a', ['b', ['c', 'd']], ['e', ['f']]]
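
To illustrate why the (item, children) tuples are easy to work with, here is a minimal sketch of a recursive walk over that structure. The helper name flatten is hypothetical (it is not part of the answer above); it simply turns the tree back into [level, text] pairs, assuming the top level is numbered 1 as in the question.

```python
def flatten(nodes, level=1):
    """Yield [level, text] pairs depth-first from (text, children) tuples."""
    for text, children in nodes:
        yield [level, text]
        # Recurse into the children one indent level deeper.
        for pair in flatten(children, level + 1):
            yield pair


# The structure produced by the stack-based loop for the question's input.
tree = [('a', []), ('b', [('c', []), ('d', [])]), ('e', [('f', [])])]
pairs = list(flatten(tree))
print(pairs)
# [[1, 'a'], [1, 'b'], [2, 'c'], [2, 'd'], [1, 'e'], [2, 'f']]
```

Because every node has the same shape, the walk needs no type checks to tell items from sublists.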

2 Comments

A = [[1,'a'],[3,'b']] isn't handled properly, because 'b' skips a level. Replacing the line node = (item, []) with node = ((item if len(stack) == level else None), []) fixes it.
Thanks! This is pretty much what I wanted to go for. How can we get the output without the empty lists?
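
The fix from the first comment can be sketched as follows. This is only an illustration under a few assumptions: the function name build_tree is hypothetical, levels are assumed to start at 1, and a skipped level is filled with a None placeholder as the comment suggests. The convert helper from the answer then drops the empty child lists.

```python
def build_tree(pairs):
    """Build (item, children) tuples from [level, item] pairs (levels start at 1)."""
    root = []
    stack = [root]
    for level, item in pairs:
        # Climb back up until the parent's child list is on top of the stack.
        while len(stack) > level:
            stack.pop()
        # Descend; any skipped intermediate level gets a None placeholder node.
        while len(stack) <= level:
            label = item if len(stack) == level else None
            node = (label, [])
            stack[-1].append(node)
            stack.append(node[1])
    return root


def convert(lst):
    """Replace childless nodes by the bare item, as in the answer."""
    return [[x, convert(y)] if y else x for x, y in lst]


skipped = build_tree([[1, 'a'], [3, 'b']])
print(skipped)
# [('a', [(None, [('b', [])])])]
print(convert(skipped))
# [['a', [[None, ['b']]]]]
```

Without the placeholder, the inner loop would insert 'b' twice, once at level 2 and once at level 3.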
0

A recursive solution: the function returns the formatted list for the part of the input list that is at or below the given level. The output format is the one Lev described, since it is consistent. Note: the function consumes the input list.

A = [[1,'a'],[1,'b'],[2,'c'],[2,'d'],[4,'x'],[5,'y'],[1,'e'],[2,'f']]

def proc_indent(level, input_list):
  if not input_list or level > input_list[0][0]:
    return None
  this_level = input_list.pop(0)[1] if level == input_list[0][0] else None
  up_levels = []
  while True:
    r = proc_indent(level+1, input_list)
    if r is None:
      break
    up_levels.append( r )
  if not up_levels:
    return [this_level]
  up_levels = [i for u in up_levels for i in u]
  return [this_level, up_levels] if this_level else [up_levels]

print(proc_indent(0, list(A)))  # pass a copy, since the recursion consumes the list
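
If consuming the input is undesirable, one possible variant threads a position through the recursion instead of calling pop(0). This is a sketch, not the answer's code: the name proc_indent_at and the (result, next_position) return value are assumptions made for illustration, and it tests "is not None" where the original tests truthiness, so an empty-string line is treated as a real item.

```python
def proc_indent_at(level, rows, pos=0):
    """Return (formatted, next_pos); formatted is None if nothing at/below level starts at pos."""
    if pos >= len(rows) or level > rows[pos][0]:
        return None, pos
    this_level = None
    if rows[pos][0] == level:
        this_level = rows[pos][1]
        pos += 1
    deeper = []
    while True:
        r, pos = proc_indent_at(level + 1, rows, pos)
        if r is None:
            break
        deeper.extend(r)  # same one-level flattening as the original
    if not deeper:
        return [this_level], pos
    if this_level is not None:
        return [this_level, deeper], pos
    return [deeper], pos


A = [[1, 'a'], [1, 'b'], [2, 'c'], [2, 'd'], [1, 'e'], [2, 'f']]
result, end = proc_indent_at(0, A)
print(result)
# [['a', 'b', ['c', 'd'], 'e', ['f']]]
```

Starting at level 0 wraps the result in one extra list, as in the original function; A itself is left unchanged, so no copy is needed.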
