0

Trying to parse XML and to store it in json format. I want to do dynamically,so that for different api keys it will also work. How to change this to dynamically to retrieve data from XML and pass it in JSON format? I need details inside PNRAmount in xml file.

Code: `


pnr = myroot.xpath("/Envelope/Body/q:SellResponse/r:BookingUpdateResponseData/r:Success/r:PNRAmount",namespaces=namespace)

balanceDue = pnr[0][0].text.strip()
AuthorizedBalanceDue = pnr[0][1].text.strip()
SegmentCount = pnr[0][2].text.strip()
PassiveSegmentCount = pnr[0][3].text.strip()
TotalCost = pnr[0][4].text.strip()
PointsBalanceDue = pnr[0][5].text.strip()
TotalPointCost = pnr[0][6].text.strip()
AlternateCurrencyBalanceDue = pnr[0][7].text.strip()


# for pnrDetails in pnr:
PNR = {
        "BalanceDue" : balanceDue,
        "AuthorizedBalanceDue" : AuthorizedBalanceDue,
        "SegmentCount": SegmentCount,
        "PassiveSegmentCount":PassiveSegmentCount,
        "TotalCost":TotalCost,
        "PointsBalanceDue": PointsBalanceDue,
        "TotalPointCost":TotalPointCost,
        "AlternateCurrencyBalanceDue":AlternateCurrencyBalanceDue

    }

print(PNR)

XML Code: `

<Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
    <Body>
        <SellResponse xmlns="http://schemas.navitaire.com/WebServices/ServiceContracts/BookingService">
            <BookingUpdateResponseData xmlns="http://schemas.navitaire.com/WebServices/DataContracts/Booking" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
                <Success>
                    <RecordLocator />
                    <PNRAmount type = "true">
                        <BalanceDue>
                            4880.00000
                        </BalanceDue>
                        <AuthorizedBalanceDue>
                            4880.00000
                        </AuthorizedBalanceDue>
                        <SegmentCount>
                            1
                        </SegmentCount>
                        <PassiveSegmentCount>
                            0
                        </PassiveSegmentCount>
                        <TotalCost>
                            4880.00000
                        </TotalCost>
                        <PointsBalanceDue>
                            0
                        </PointsBalanceDue>
                        <TotalPointCost>
                            0
                        </TotalPointCost>
                        <AlternateCurrencyBalanceDue>
                            0
                        </AlternateCurrencyBalanceDue>
                    </PNRAmount>
                </Success>
                <Warning i:nil="true" />
                <Error i:nil="true" />
                <OtherServiceInformations i:nil="true" xmlns:a="http://schemas.navitaire.com/WebServices/DataContracts/Common" />
            </BookingUpdateResponseData>
        </SellResponse>
    </Body>
</Envelope>

I've tried the static way, writing path for each tag and getting the value by .text method. I want to get the data dynamic way such that if key is different it will work.

Trying to do something like this

 vehicle_response = detail_root.xpath('/x:Envelope/x:Body/y:VehicleLocationDetailRsp',
                                         namespaces=optionstravel_vehicle_ns())[0]

counter_location = vehicle_response.xpath('y:LocationInfo',
                                              namespaces=optionstravel_vehicle_ns())[0].get('CounterLocation')

Then passing counter_location as value in json.

4
  • Did you try something like that? linuxhint.com/python_xml_to_dictionary Commented Dec 20, 2022 at 10:31
  • I want to know the method Commented Dec 20, 2022 at 10:37
  • The article axplains the process in detail. In generat the xmltodict.parse() method creates a nested OrderedDict out of your XLM - so will have your data ready somewhere in there. I am not sure if I follow your requirements though. Commented Dec 20, 2022 at 10:43
  • This method (XMLtoDict) takes to much time. I need more efficient one. Commented Dec 20, 2022 at 11:00

2 Answers 2

1

You can use SAX interface which is built-in in Python. SAX offers sort of event-based XML parsing mechanism. It is rather helpful in your case since you want to parse only particular section of XML document. One has to implement a custom ContentHandler class to succeed.

Based on that tutorial, below is a handler class and respective code to parse your XML file into JSON file:

from xml.sax.handler import ContentHandler
from xml.sax import parse
import json
from collections import OrderedDict

class MyXmlPayloadHandler(ContentHandler):
    def __init__(self, section = ""):
        super().__init__()
        self.desired_section_name = section  # target section of XML doc
        self.desired_section_ongoing = False
        self.current_element = ""  # currently ongoing key/property/element name
        self.output_dict = OrderedDict() # the actual dictionary which will be subjected to jsonification

    def startElement(self, name, attrs):
        """ Called by the parser when new XML element begins """
        if self.desired_section_ongoing:
            self.current_element = name
            self.output_dict[name] = ""
        if self.desired_section_name in name:
            self.desired_section_ongoing = True

    def endElement(self, name):
        """ Called by the parser when XML element ends """
        # todo: you might want to implement conversion to int here,
        # i.e. at the end of each element
        if self.desired_section_ongoing:
            if self.desired_section_name in name:
                self.desired_section_ongoing = False

    def characters(self, content):
        """ Called by the parser to add characters to the element's value """
        if content.strip() != "":
            if self.desired_section_ongoing:
                # add the characters to the current element value
                self.output_dict[self.current_element] += content

# do the job
handler = MyXmlPayloadHandler(section = "PNRAmount")
parse(r"examples\sample.xml", handler)

# dumping the dict into json file
with open("datadict.json", "w") as f:
    json.dump(handler.output_dict, f, indent=4)

Resulting datadict.json file:

{
    "BalanceDue": "4880.00000",
    "AuthorizedBalanceDue": "4880.00000",
    "SegmentCount": "1",
    "PassiveSegmentCount": "0",
    "TotalCost": "4880.00000",
    "PointsBalanceDue": "0",
    "TotalPointCost": "0",
    "AlternateCurrencyBalanceDue": "0"
}
Sign up to request clarification or add additional context in comments.

2 Comments

I think OP wants to do another way around.
@mzjn thanks, I elaborated on my answer a bit...
0

You can use pandas read_xml() for parsing the XML and write it to to_json(), Source:

import pandas as pd

xml = 'Gorantula.xml'
namespaces={"xmlns:s": "http://schemas.xmlsoap.org/soap/envelope/",
            "xmlns": "http://schemas.navitaire.com/WebServices/DataContracts/Booking",
            "xmlns:i": "http://www.w3.org/2001/XMLSchema-instance"}

df = pd.read_xml(xml, xpath="//xmlns:PNRAmount", namespaces=namespaces)
df1 = df.T

df1.to_json(r'Gorantula.json')
print(df1)

Output this file:

{"0":{"type":true,
    "BalanceDue":4880.0,
    "AuthorizedBalanceDue":4880.0,
    "SegmentCount":1,
    "PassiveSegmentCount":0,
    "TotalCost":4880.0,
    "PointsBalanceDue":0,
    "TotalPointCost":0,
    "AlternateCurrencyBalanceDue":0}}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.