
I want to get a list of all trololo elements that have an attr attribute (but not xxx or any other element), using Python, from the following XML:

<data>
    <test>
        <trololo attr="1">
        </trololo>
    </test>
    <test>
        <trololo>
        </trololo>
    </test>
    <test>
        <trololo attr="X">
        </trololo>
    </test>
    <test>
        <xxx attr="Y">
        </xxx>
    </test>
</data>

I've tried //*[@attr], but the result includes the xxx element as well. All other variations I've tried have failed so far.

The actual Python code I'm using:

import xml.etree.ElementTree as ET
from pprint import pprint

tree  = ET.parse('test.xml')
nodes = tree.findall('//*trololo[@attr]')

pprint(nodes)

Output:

[]

UPDATE:

I've found out this was a namespace problem, which makes this question a duplicate. The problem was that my root node looked like this:

<data xmlns="http://example.com">
  • Please note that I don't know the actual depth of the <trololo> nodes. They might be 100 levels below the root. Commented Nov 16, 2017 at 10:29
  • Which python version are you using? Commented Nov 16, 2017 at 10:32
  • I have run the same from my terminal, and I got the output [<Element 'trololo' at 0x7fab55c90ef8>, <Element 'trololo' at 0x7fab55c903b8>] Commented Nov 16, 2017 at 10:33
  • @FarhanK I'm using Python 3. Commented Nov 16, 2017 at 10:34
  • nodes = tree.findall('//trololo[@attr]'), i.e. without the *? Commented Nov 16, 2017 at 10:41

1 Answer


All elements by name with a named attribute

As @har07 correctly answers in the comments, the XPath

//trololo[@attr]

will select all trololo elements with an attr attribute (regardless of its value), as requested.
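For illustration, here is a minimal sketch of that expression with ElementTree, assuming the sample XML from the question with no namespace on the root. The XML is inlined as a string (the `XML` and `attr_values` names are mine, not from the question), and the path starts with a dot because ElementTree expects a relative path when searching from an element.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined here instead of read from test.xml.
XML = """<data>
    <test><trololo attr="1"></trololo></test>
    <test><trololo></trololo></test>
    <test><trololo attr="X"></trololo></test>
    <test><xxx attr="Y"></xxx></test>
</data>"""

root = ET.fromstring(XML)

# './/' searches all descendants of root, at any depth.
nodes = root.findall('.//trololo[@attr]')

# Collect the attribute values to show which elements matched.
attr_values = [node.get('attr') for node in nodes]
print(attr_values)
```

Only the two trololo elements that carry attr should come back; the bare trololo and the xxx element are skipped.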

This string,

//*trololo[@attr]

is syntactically not an XPath expression at all, but it does resemble

//*:trololo[@attr]

which is syntactically valid under XPath 2.0 (but not XPath 1.0). It says to select trololo elements in any namespace. To disregard namespaces in XPath 1.0 (though you really shouldn't), use local-name():

//*[local-name() = 'trololo' and @attr]
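As far as I know, ElementTree's XPath subset has no local-name() function, so the sketch below shows two stand-ins, assuming the root carries the default namespace http://example.com from the question's update. The first passes a prefix map to findall (the prefix `d` is arbitrary). The second imitates local-name() by stripping the `{namespace}` part from each tag; the `local_name` helper is hypothetical, written only for this example.

```python
import xml.etree.ElementTree as ET

# Same sample as in the question, but with the default namespace on the root.
XML_NS = """<data xmlns="http://example.com">
    <test><trololo attr="1"></trololo></test>
    <test><trololo></trololo></test>
    <test><trololo attr="X"></trololo></test>
    <test><xxx attr="Y"></xxx></test>
</data>"""

root = ET.fromstring(XML_NS)

# Unqualified name: matches nothing, because every element is now
# really named '{http://example.com}trololo'.
plain = root.findall('.//trololo[@attr]')

# Preferred: bind a prefix to the namespace and use it in the path.
ns = {'d': 'http://example.com'}  # 'd' is an arbitrary prefix
qualified = root.findall('.//d:trololo[@attr]', ns)


def local_name(tag):
    """Return the tag without its '{namespace}' part (hypothetical helper)."""
    return tag.rsplit('}', 1)[-1]


# Namespace-agnostic fallback, roughly what local-name() does in XPath 1.0.
by_local_name = [
    el for el in root.iter()
    if local_name(el.tag) == 'trololo' and 'attr' in el.attrib
]

print(len(plain), len(qualified), len(by_local_name))
```

The prefix map is the safer choice when the namespace is known, since the fallback would also match a trololo element from an unrelated namespace.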

Other variations

  • All elements with a named attribute: //*[@attr]
  • All elements with any attribute: //*[@*]
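A rough Python sketch of these two variations, again assuming the un-namespaced sample XML from the question. ElementTree implements only a subset of XPath, so the second variation is written here as a plain filter on each element's `attrib` dictionary rather than as an XPath predicate; the variable names are mine.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined as a string.
XML = """<data>
    <test><trololo attr="1"></trololo></test>
    <test><trololo></trololo></test>
    <test><trololo attr="X"></trololo></test>
    <test><xxx attr="Y"></xxx></test>
</data>"""

root = ET.fromstring(XML)

# All elements with an attr attribute, whatever their name.
with_attr = root.findall('.//*[@attr]')

# All elements with at least one attribute of any name:
# an empty attrib dict is falsy, so it filters those elements out.
with_any_attribute = [el for el in root.iter() if el.attrib]

print([el.tag for el in with_attr])
print([el.tag for el in with_any_attribute])
```

With this particular sample both lists contain the same three elements, since attr is the only attribute that appears anywhere in the document.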

1 Comment

I've found out this was indeed a namespace problem. Thanks!
