Suppose I have a string s = '{aaaa{bc}xx{d{e}}f}' with a nested-list structure. I would like a hierarchical representation of it that still lets me access the sub-strings corresponding to the valid sub-lists. For simplicity, let's forget about the hierarchy: I just want a list of the sub-strings corresponding to valid sub-lists, something like:
['{aaaa{bc}xx{d{e}}f}', '{bc}', '{d{e}}', '{e}']
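To make the target concrete, here is a plain-Python sketch (no pyparsing) that produces this list with a stack of open-brace positions. It is only meant to pin down the desired output, not to replace a pyparsing solution; the helper name balanced_substrings is made up for illustration.

```python
def balanced_substrings(text, open_ch='{', close_ch='}'):
    """Return every balanced open_ch...close_ch substring of text,
    ordered by the position of its opening brace.

    Hypothetical helper for illustration; not part of pyparsing.
    """
    stack = []  # indices of braces that are currently open
    spans = []  # (start, end) slice bounds of each closed group
    for i, ch in enumerate(text):
        if ch == open_ch:
            stack.append(i)
        elif ch == close_ch:
            if not stack:
                raise ValueError(f"unmatched {close_ch!r} at index {i}")
            spans.append((stack.pop(), i + 1))
    if stack:
        raise ValueError(f"unmatched {open_ch!r} at index {stack[-1]}")
    spans.sort()  # groups close inner-first; sort to get outer-first order
    return [text[start:end] for start, end in spans]


s = '{aaaa{bc}xx{d{e}}f}'
print(balanced_substrings(s))
```

What I am after is the pyparsing equivalent, where each sub-list keeps both its nested structure and its original text.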
Using nestedExpr, one can obtain the nested structure, which includes all valid sub-lists:
import pyparsing as pp
s = '{aaaa{bc}xx{d{e}}f}'
not_braces = pp.CharsNotIn('{}')
expr = pp.nestedExpr('{', '}', content=not_braces)
res = expr('L0 Contents').parseString(s)
print(res.dump())
prints:
[['aaaa', ['bc'], 'xx', ['d', ['e']], 'f']]
- L0 Contents: [['aaaa', ['bc'], 'xx', ['d', ['e']], 'f']]
[0]:
  ['aaaa', ['bc'], 'xx', ['d', ['e']], 'f']
  [0]:
    aaaa
  [1]:
    ['bc']
  [2]:
    xx
  [3]:
    ['d', ['e']]
    [0]:
      d
    [1]:
      ['e']
  [4]:
    f
To get the original string for a parsed element, I have to wrap it in pyparsing.originalTextFor(). However, this removes all sub-lists from the result:
s = '{aaaa{bc}xx{d{e}}f}'
not_braces = pp.CharsNotIn('{}')
expr = pp.nestedExpr('{', '}', content=not_braces)
res = pp.originalTextFor(expr)('L0 Contents').parseString(s)
print(res.dump())
prints:
['{aaaa{bc}xx{d{e}}f}']
- L0 Contents: '{aaaa{bc}xx{d{e}}f}'
In effect, the originalTextFor() wrapper flattened out everything that was inside it.
The question. Is there an alternative to originalTextFor() that keeps the structure of its child parse elements? (It would be nice to have a non-discarding analogue that could be used to create named tokens for parsed sub-expressions.)
Note that scanString() only gives me the level-0 sub-lists and does not look inside them. I guess I could use setParseAction(), but the internal operation of ParserElement objects is not documented, and I haven't had a chance to dig into the source code yet. Thanks!
Update 1. Somewhat related: https://stackoverflow.com/a/39885391/11932910 https://stackoverflow.com/a/17411455/11932910