I have got a tree structure file,where parenthesis are used to represent the tree. Here is the code toconvert the same to a python nested list
def foo(s):
def foo_helper(level=0):
try:
token = next(tokens)
except StopIteration:
if level != 0:
raise Exception('missing closing paren')
else:
return []
if token == ')':
if level == 0:
raise Exception('missing opening paren')
else:
return []
elif token == '(':
return [foo_helper(level+1)] + foo_helper(level)
else:
return [token] + foo_helper(level)
tokens = iter(s)
return foo_helper()
as given by how to parse a strign and return nested array.
Here it works fine, when the characters are 1 in length. for words or sentences, the same doesnt properly work. My sample of tree is:
( Satellite (span 69 74) (rel2par Elaboration)
( Nucleus (span 69 72) (rel2par span)
( Nucleus (span 69 70) (rel2par span)
( Nucleus (leaf 69) (rel2par span) (text _!MERRILL LYNCH READY ASSETS TRUST :_!) )
( Satellite (leaf 70) (rel2par Elaboration) (text _!8.65 % ._!) )
)
( Satellite (span 71 72) (rel2par Elaboration)
( Nucleus (leaf 71) (rel2par span) (text _!Annualized average rate of return_!) )
( Satellite (leaf 72) (rel2par Temporal) (text _!after expenses for the past 30 days ;_!) )
)
)
( Satellite (span 73 74) (rel2par Elaboration)
( Nucleus (leaf 73) (rel2par span) (text _!not a forecast_!) )
( Satellite (leaf 74) (rel2par Elaboration) (text _!of future returns ._!) )
)
)
here, the output need to be
['satellite',['span','69','74'].........] but with the given function what I get is ['s','a','t'...............['s','p','a','n','7','3']..............]
How can this be modified?
foo_helperknowtokens? I would've thought that aNameErrorwould've been raised, sincetokensis defined below the definition offoo_helper.foobasically.