I'm trying to create a large XML file in memory that will be inserted into a Blob field in an ESRI feature class.

I attempted to use ElementTree, but Python would eventually crash. I probably wasn't doing it the best way. An example of my code (not exact):

import xml.etree.ElementTree as ET

with update_cursor on feature class:
    for row in update_cursor:
        root = ET.Element("root")
        tree = ET.ElementTree(root)
        for id in id_list:
            if row[0] in id:
                equipment = ET.Element("equipment")
                root.append(equipment)

                attrib1 = ET.Element("attrib1")
                equipment.append(attrib1)
                attrib1.text = "myattrib1"

                attrib2 = ET.Element("attrib2")
                equipment.append(attrib2)
                attrib2.text = "myattrib2"

                ....and about 5 more of these appended to equipment

        xml_data = ET.tostring(root)

        insert xml_data into blob field

Example of the XML:

<root>
  <equipment>
    <attrib1>One</attrib1>
    <attrib2>Two</attrib2>
    <attrib3>Three</attrib3>
    ...
    <attrib10>Ten</attrib10>
  </equipment>
  <equipment>
    <attrib1>One</attrib1>
    <attrib2>Two</attrib2>
    <attrib3>Three</attrib3>
    ...
    <attrib10>Ten</attrib10>
  </equipment>
</root>

Now I realize this is probably a pretty amateur way of doing this, but I'm not sure of the best way to build this XML in memory.

For each row in the update_cursor, there could be multiple "equipment" elements added to the root, and each "equipment" element will have the exact same child elements but with different attributes.

I ran this and there were about 200 ids that matched a single row, so it had to create the equipment element and all the children of the equipment 200 times in memory.

So what is the best way to create XML in memory with Python using a standard library?

  • It would help us greatly if you describe what the input looks like (i.e. the row and id_list). Commented Mar 11, 2014 at 16:58
  • This is working with spatial data. The row is just grabbing the unique ID of the point, and id_list is just a list of IDs that match this unique ID. If the ID matches, it fills in the XML with the attributes of the ID from the list. Each unique ID can have multiple matches from id_list, which represent equipment. Commented Mar 12, 2014 at 14:12
  • I'm just wondering if there is a better way to write the XML than I have here. Commented Mar 12, 2014 at 15:08

2 Answers

Your data structure looks dead simple. Do not bother using an XML library. Just write your lines directly into a cStringIO.StringIO.

import cStringIO  # Python 2 only; on Python 3 use io.StringIO instead

with update_cursor on feature class:
    for row in update_cursor:
        buffer = cStringIO.StringIO()
        buffer.write("<root>\n")
        for id in id_list:
            if row[0] in id:
                buffer.write("    <equipment>\n")
                buffer.write("        <attrib1>One</attrib1>\n")
                buffer.write("        <attrib2>Two</attrib2>\n")
                buffer.write("        <attrib3>Three</attrib3>\n")

                ....and about 5 more of these appended to equipment

                buffer.write("    </equipment>\n")

        buffer.write("</root>\n")

        xml_data = buffer.getvalue()

        insert xml_data into blob field
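
One caveat with writing the tags by hand: text values are not escaped for you, so a value containing &, < or > would produce malformed XML. Below is a minimal sketch of the same idea in Python 3, where io.StringIO takes the place of cStringIO.StringIO and xml.sax.saxutils.escape from the standard library handles the escaping. The function name build_equipment_xml and the input shape (a list of dicts mapping tag name to text) are assumptions made for illustration; they are not from the question.

```python
import io
from xml.sax.saxutils import escape


def build_equipment_xml(equipment_rows):
    """Build the <root> document as a string.

    equipment_rows is a hypothetical input: an iterable of dicts, each
    mapping a child tag name (e.g. "attrib1") to its text value.
    """
    buffer = io.StringIO()
    buffer.write("<root>\n")
    for attribs in equipment_rows:
        buffer.write("  <equipment>\n")
        for tag, text in attribs.items():
            # escape() replaces &, < and > so the output stays well-formed
            buffer.write("    <%s>%s</%s>\n" % (tag, escape(str(text)), tag))
        buffer.write("  </equipment>\n")
    buffer.write("</root>\n")
    return buffer.getvalue()


xml_data = build_equipment_xml([
    {"attrib1": "One", "attrib2": "Two & a half"},
])
```

The tag names are written as-is, so this only works if they are known to be valid XML names.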

2 Comments

Well, I guess what I didn't mention is that I'm not always creating this data from scratch. Sometimes there will be existing data in the blob, and I will have to read it, add new equipment to it, and rewrite the XML to the blob. But I didn't know about cStringIO, which is very cool.
Either way, I think you will need something that is a bit tailored to your use case rather than a more general-purpose XML tool. If you can use a non-bundled package, lxml might do better. If you can write a C extension and use a lighter-weight DOM library, that may also be worthwhile. You can also give xml.etree.cElementTree a shot, but I am not sure its memory requirements are that much different. Another idea would be to work on smaller files that you can concatenate at the end.
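
For the case where the blob already holds XML, one standard-library option is to parse the existing bytes with ET.fromstring, append the new equipment elements, and serialize the tree again. A rough sketch, assuming the existing blob contains a well-formed <root> document; add_equipment and the list-of-dicts input are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET


def add_equipment(existing_xml, new_rows):
    """Parse existing XML bytes, append <equipment> elements, re-serialize."""
    root = ET.fromstring(existing_xml)
    for attribs in new_rows:
        equipment = ET.SubElement(root, "equipment")
        for tag, text in attribs.items():
            ET.SubElement(equipment, tag).text = str(text)
    # In Python 3, tostring() returns bytes by default
    return ET.tostring(root)


# Stand-in for the bytes read from the blob field
existing = b"<root><equipment><attrib1>One</attrib1></equipment></root>"
updated = add_equipment(existing, [{"attrib1": "Uno"}])
```

This still holds the whole document in memory, so it does not by itself address the memory concern.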

You can use ET.SubElement to create and append elements:

equipment = ET.SubElement(root, "equipment")
ET.SubElement(equipment, "attrib1").text = "One"
ET.SubElement(equipment, "attrib2").text = "Two"
ET.SubElement(equipment, "attrib3").text = "Three"
...

This is shorter and clearer than creating each Element and appending it separately.
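
Put together, the SubElement approach could look something like the sketch below (Python 3). The build_root function and its list-of-dicts input are assumptions for illustration, standing in for whatever the matching entries of id_list actually contain.

```python
import xml.etree.ElementTree as ET


def build_root(equipment_rows):
    """Return a <root> element with one <equipment> child per dict."""
    root = ET.Element("root")
    for attribs in equipment_rows:
        equipment = ET.SubElement(root, "equipment")
        for tag, text in attribs.items():
            # SubElement creates the child and appends it in one call
            ET.SubElement(equipment, tag).text = str(text)
    return root


root = build_root([
    {"attrib1": "One", "attrib2": "Two"},
    {"attrib1": "Uno", "attrib2": "Dos"},
])
# In Python 3, tostring() returns bytes by default
xml_data = ET.tostring(root)
```

Unlike hand-written strings, ElementTree escapes special characters in the text values when serializing.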
