I am using the xml.etree.ElementTree module to parse an XML file, collect each node's attributes into lists, and then insert those lists into a MySQL database (I am not worried about that last step, so there is no need to cover it here). This works, but only for one child node at a time. The goal is to handle every child node, however many there are. Here is a sample file:
<?xml version="1.0"?>
<catalog>
<book id="bk101" type="hardcover">
<info author="Gambardella, Matthew" title="XML Developer's Guide" genre="Computer" price="44.95" publish_date="2000-10-01" description="An in-depth look at creating applications
with XML." />
</book>
<book id="bk102" type="softcover">
<info author="Ralls, Kim" title="Midnight Rain" genre="Fantasy" price="5.95" publish_date="2000-10-01" description="A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world." />
</book>
<book id="bk103" type="softcover">
<info author="Corets, Eva" title="Maeve Ascendant" genre="Fantasy" price="5.95" publish_date="2000-11-17" description="After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society." />
</book>
</catalog>
I can get the correct attributes for either the first book node (id="bk101") or the last book node (id="bk103") as a list. However, I only get one list of each kind per file, when I need one list for each book node and each info node (so 6 lists in total for this file).
Here is my code:
import xml.etree.ElementTree

book_attribute = ['id', 'type']
info_attribute = ['author', 'title', 'genre', 'price', 'publish_date', 'description']


class ApplicationClass(object):  # define the only class in this file
    def __init__(self):
        self.ET = xml.etree.ElementTree.parse('file.xml').getroot()
        self.bookNodes = self.ET.findall('book')
        self.book_values_list = []
        self.info_values_list = []

    def get_book(self):
        for bookNode in self.bookNodes:
            self.book_values_list = [bookNode.get(i) for i in book_attribute]
        return self.book_values_list

    def get_info(self):
        for bookNode in self.bookNodes:
            for infoNode in bookNode.findall('info'):
                self.info_values_list = [infoNode.get(i) for i in info_attribute]
        return self.info_values_list


a = ApplicationClass()
a.get_book()
print(a.book_values_list)
a.get_info()
print(a.info_values_list)
So I know my problem is that I am only returning one list per function because I am returning the list at the end of the function and then calling the function at the end of my script. I just can't find the proper way to achieve my desired outcome. If I don't run my functions at the end of the script, then how can I return the multiple lists that I am looking for?
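For reference, here is a minimal sketch of the kind of result I am after, assuming the underlying issue is that each loop iteration overwrites the previous list rather than adding to a collection. The function name collect_attribute_lists and the inline SAMPLE_XML string are made up for illustration (the real code reads file.xml), and the description attribute is left out to keep the sample short. Each per-node list is appended to an outer list, so nothing is lost between iterations:

```python
import xml.etree.ElementTree as ET

BOOK_ATTRIBUTES = ['id', 'type']
# 'description' is omitted here only to keep the inline sample short.
INFO_ATTRIBUTES = ['author', 'title', 'genre', 'price', 'publish_date']

# Hypothetical, shortened stand-in for file.xml.
SAMPLE_XML = """<catalog>
<book id="bk101" type="hardcover">
<info author="Gambardella, Matthew" title="XML Developer's Guide" genre="Computer" price="44.95" publish_date="2000-10-01" />
</book>
<book id="bk102" type="softcover">
<info author="Ralls, Kim" title="Midnight Rain" genre="Fantasy" price="5.95" publish_date="2000-10-01" />
</book>
<book id="bk103" type="softcover">
<info author="Corets, Eva" title="Maeve Ascendant" genre="Fantasy" price="5.95" publish_date="2000-11-17" />
</book>
</catalog>"""


def collect_attribute_lists(root):
    """Return (book_lists, info_lists), with one inner list per node."""
    book_lists = []
    info_lists = []
    for book_node in root.findall('book'):
        # Append instead of assigning, so earlier nodes are kept.
        book_lists.append([book_node.get(name) for name in BOOK_ATTRIBUTES])
        for info_node in book_node.findall('info'):
            info_lists.append([info_node.get(name) for name in INFO_ATTRIBUTES])
    return book_lists, info_lists


root = ET.fromstring(SAMPLE_XML)
book_lists, info_lists = collect_attribute_lists(root)
for row in book_lists + info_lists:
    print(row)
```

With the sample above this gives three book lists and three info lists, which is the 6 lists I described. Whether a list of lists is the right shape for the later database insert is something I have not settled on.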