I am trying to get the text of an element using minidom. In the following code I have also tried the getText() method as suggested here, but I am unable to get the desired output: I don't get the text value from the element I am working on. Following is my code.

import xml.dom.minidom

doc = xml.dom.minidom.parse("DL_INVOICE_DETAIL_TCB.xml")
results = doc.getElementsByTagName("G_TRANSACTIONS")
def getText(nodelist):
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)
for result in results:
    for element in result.getElementsByTagName("INVOICE_NUMBER"):
        print(element.nodeType)
        print(element.nodeValue)

Following is my XML sample

<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>

I am using the following

2 Answers

A minidom-based answer:

from xml.dom import minidom

xml = """\
<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31006</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
    </G_TRANSACTIONS>    
</LIST_G_TRANSACTIONS>"""

dom = minidom.parseString(xml)
invoice_numbers = [int(x.firstChild.data) for x in dom.getElementsByTagName("INVOICE_NUMBER")]
print(invoice_numbers)

output

[31002, 31006]
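For context on why the code in the question prints None: in the DOM, an element node's nodeValue is always None, and the text lives in the element's child text nodes. The sketch below reuses the idea of the question's getText helper (renamed get_text here) on element.childNodes, which also copes with an empty element, where firstChild would be None. The inline XML, including the deliberately empty INVOICE_NUMBER, is made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample; the second INVOICE_NUMBER is deliberately empty.
xml = """\
<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
    </G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER></INVOICE_NUMBER>
    </G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""


def get_text(nodelist):
    """Join the data of all direct text-node children."""
    return "".join(n.data for n in nodelist if n.nodeType == n.TEXT_NODE)


dom = minidom.parseString(xml)
elements = dom.getElementsByTagName("INVOICE_NUMBER")

# nodeValue is None for element nodes; only text nodes carry data.
node_values = [e.nodeValue for e in elements]
# Reading the child text nodes gives the content, or '' when there is none.
texts = [get_text(e.childNodes) for e in elements]

print(node_values)
print(texts)
```

Here node_values is [None, None] and texts is ['31002', '']. With x.firstChild.data the empty element would raise an AttributeError instead, so the helper may be the safer choice if some elements can be empty.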

If using ElementTree is fine with you, here is the code:

import xml.etree.ElementTree as ET

xml = '''<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31006</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
    </G_TRANSACTIONS>    
</LIST_G_TRANSACTIONS>'''

root = ET.fromstring(xml)
invoice_numbers = [entry.text for entry in root.findall('.//INVOICE_NUMBER')]
print(invoice_numbers)

output

['31002', '31006']
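If the fields need to stay grouped per transaction, one possible variation is to iterate over the G_TRANSACTIONS elements and read each child with findtext. This is only a sketch: it assumes every record contains both child elements and that the invoice number is always an integer, and the records name is chosen here for illustration.

```python
import xml.etree.ElementTree as ET

xml = """<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31006</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""

root = ET.fromstring(xml)

# One (invoice_number, transaction_class) tuple per G_TRANSACTIONS element.
# Assumes both children are present in every record.
records = [
    (int(t.findtext("INVOICE_NUMBER")), t.findtext("TRANSACTION_CLASS"))
    for t in root.findall("G_TRANSACTIONS")
]
print(records)
```

This prints [(31002, 'Invoice1'), (31006, 'Invoice2')].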

2 Comments

Thanks for the reply, but I need it via minidom.
Just to let you know, ElementTree is part of the Python standard library, just like minidom, and not an external package.
