I would like to use an XPath expression to get a list of lists (or sequence of sequences) that groups the extracted XML attribute values by parent element, in document order.
Here are my attempts so far, using a minimal example:
import elementpath, lxml.etree
xml = '''<a>
<b c="1">
<d e="3"/>
<d e="4"/>
</b>
<b c="2">
<d e="5"/>
<d e="6"/>
</b>
</a>'''
tree = lxml.etree.fromstring(xml.encode())
xpath1 = '/a/b/d/@e'
xpath2 = 'for $b in (/a/b) return concat("[", $b/string-join(d/@e, ", "), "]")'
print('1:', elementpath.select(tree, xpath1))
print('2:', elementpath.select(tree, xpath2))
print('3:', [['3', '4'], ['5', '6']])
This outputs:
1: ['3', '4', '5', '6']
2: ['[3, 4]', '[5, 6]']
3: [['3', '4'], ['5', '6']]
xpath1 returns a flattened list/sequence, with no grouping by parent element.
xpath2 is the closest I have come so far, but it returns each sub-array as a string rather than as an array.
Option 3 is what I am after.
Is anyone able to advise on a better way of doing this with just an XPath expression?
Thanks, Mark
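A plain XPath sequence is always flat, so the grouping has to happen in Python, with one relative query per parent element:
[x.xpath('d/@e') for x in tree.xpath('/a/b')]
gives you what you want.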
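For completeness, here is a minimal runnable sketch of that approach applied to the XML from the question. It uses lxml's built-in xpath() method (XPath 1.0) rather than elementpath, and the variable name grouped and the str() conversion are illustrative additions, not part of the original answer.

```python
import lxml.etree

xml = '''<a>
<b c="1">
<d e="3"/>
<d e="4"/>
</b>
<b c="2">
<d e="5"/>
<d e="6"/>
</b>
</a>'''

tree = lxml.etree.fromstring(xml.encode())

# The outer query selects each parent <b> in document order;
# the inner query is evaluated relative to that parent, so the
# attribute values stay grouped per <b>.
grouped = [
    [str(value) for value in b.xpath('d/@e')]  # str() gives plain Python strings
    for b in tree.xpath('/a/b')
]

print(grouped)  # [['3', '4'], ['5', '6']]
```

The trade-off is that the grouping is done by the Python list comprehension, not by a single XPath expression, so this is two queries per document level rather than one.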