I've got a question about parsing a rather complicated XML document in Python with xml.etree.ElementTree. The XML is scap-security-guide-0.1.75/ssg-ubuntu2204-ds.xml from https://github.com/ComplianceAsCode/content/releases/download/v0.1.75/scap-security-guide-0.1.75.zip and the root tag and its attributes are:

<ds:data-stream-collection xmlns:cat="urn:oasis:names:tc:entity:xmlns:xml:catalog" xmlns:cpe-dict="http://cpe.mitre.org/dictionary/2.0" xmlns:cpe-lang="http://cpe.mitre.org/language/2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ds="http://scap.nist.gov/schema/scap/source/1.2" xmlns:html="http://www.w3.org/1999/xhtml" xmlns:ind="http://oval.mitre.org/XMLSchema/oval-definitions-5#independent" xmlns:linux="http://oval.mitre.org/XMLSchema/oval-definitions-5#linux" xmlns:ocil="http://scap.nist.gov/schema/ocil/2.0" xmlns:oval="http://oval.mitre.org/XMLSchema/oval-common-5" xmlns:oval-def="http://oval.mitre.org/XMLSchema/oval-definitions-5" xmlns:unix="http://oval.mitre.org/XMLSchema/oval-definitions-5#unix" xmlns:xccdf-1.2="http://checklists.nist.gov/xccdf/1.2" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" id="scap_org.open-scap_collection_from_xccdf_ssg-ubuntu2204-xccdf.xml" schematron-version="1.3">

When I load the document with ET.parse(...).getroot() and look at the root element, I can only see the attributes without a namespace:

id='scap_org.open-scap_collection_from_xccdf_ssg-ubuntu2204-xccdf.xml'
schematron-version='1.3'

I don't really need the other attributes but I'm curious why I don't get them all. What if I needed one of the other attributes? How would I access them?

  • xmlns:something attributes are namespace declarations, not ordinary attributes. Try lxml: from lxml import etree; tree = etree.parse("/home/lmc/tmp/soap.xml"); tree.xpath('//namespace::*') Commented Dec 5, 2024 at 13:04
  • stackoverflow.com/a/71801408/2834978 Commented Dec 5, 2024 at 13:11

1 Answer

Those other "attributes" are namespace declarations and are "reserved attributes" that behave a little differently.

A namespace (or more precisely, a namespace binding) is declared using a family of reserved attributes. Such an attribute's name must either be xmlns or begin xmlns:. These attributes, like any other XML attributes, may be provided directly or by default.
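To see this behaviour in xml.etree.ElementTree, here is a minimal sketch. The SAMPLE string is a hypothetical, shortened stand-in for the real root element, and the xlink:type attribute is added purely for illustration (the original root tag has no namespaced attributes). It suggests that the xmlns:* declarations are consumed by the parser, while a genuinely namespaced attribute does show up in attrib under its {uri}localname key:

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the data stream root element.
# xlink:type is NOT in the original document; it is added here only to
# show what a real namespaced attribute looks like after parsing.
SAMPLE = (
    '<ds:data-stream-collection'
    ' xmlns:ds="http://scap.nist.gov/schema/scap/source/1.2"'
    ' xmlns:xlink="http://www.w3.org/1999/xlink"'
    ' id="demo" schematron-version="1.3" xlink:type="simple"/>'
)

XLINK = "http://www.w3.org/1999/xlink"

root = ET.fromstring(SAMPLE)

# The prefix is resolved into the tag name; the declaration itself is gone.
print(root.tag)

# Only ordinary attributes remain; no key starts with "xmlns".
print(root.attrib)

# A namespaced attribute is looked up with the {uri}localname form.
xlink_type = root.get(f"{{{XLINK}}}type")
print(xlink_type)
```

So if one of the "other attributes" you need is an actual attribute in a namespace, root.get("{uri}name") should retrieve it; the xmlns:* entries themselves are not available this way.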

There are no attribute nodes corresponding to attributes that declare namespaces. In XPath the declarations are represented as namespace nodes instead, which can be selected using the namespace:: axis:

/*/namespace::*
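
ElementTree's limited XPath support does not include the namespace:: axis, so with the standard library one option is to collect the declarations while parsing. The following is a sketch, assuming the "start-ns" event of ET.iterparse is acceptable for your use case. The helper name namespace_declarations and the SAMPLE bytes are hypothetical, and the helper keeps only the first URI seen for each prefix:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the real data stream file.
SAMPLE = (
    b'<ds:data-stream-collection'
    b' xmlns:ds="http://scap.nist.gov/schema/scap/source/1.2"'
    b' xmlns:xlink="http://www.w3.org/1999/xlink"'
    b' id="demo" schematron-version="1.3"/>'
)


def namespace_declarations(source):
    """Return {prefix: uri} for the xmlns declarations met while parsing.

    `source` may be a filename or a binary file object. If a prefix is
    declared more than once, only the first binding is kept (a
    simplification made for this sketch).
    """
    declared = {}
    # With events=("start-ns",), each item is ("start-ns", (prefix, uri)).
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        declared.setdefault(prefix, uri)
    return declared


ns = namespace_declarations(io.BytesIO(SAMPLE))
print(ns)
```

The resulting dictionary can also be passed as the namespaces argument to find() and findall(), for example root.findall("ds:data-stream", ns), so the prefixes from the document can be reused in path expressions.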