How to write data from multiple xml tags into multiple columns in csv?

Question

I am trying to take data from an API call that returns an XML object and parse a few data points into a CSV file, with each data point in its own column.

The XML looks like this:

<?xml version="1.0" encoding="utf-8" ?>

<YourMembership_Response>
<Items>
<Item>
<ItemID></ItemID>
<ID>92304823A-2932</ID>
<WebsiteID>0987</WebsiteID>
<NamePrefix></NamePrefix>
<FirstName>John</FirstName>
<MiddleName></MiddleName>
<LastName>Smith</LastName>
<Suffix></Suffix>
<Nickname></Nickname>
<EmployerName>abc company</EmployerName>
<WorkTitle>manager</WorkTitle>
<Date>3/14/2013 2:12:39 PM</Date>
<Description>Removed from group by Administration.</Description>
</Item>
<Item>
<ItemID></ItemID>
<ID>92304823A-2932</ID>
<WebsiteID>0987</WebsiteID>
<NamePrefix></NamePrefix>
<FirstName>John</FirstName>
<MiddleName></MiddleName>
<LastName>Smith</LastName>
<Suffix></Suffix>
<Nickname></Nickname>
<EmployerName>abc company</EmployerName>
<WorkTitle>manager</WorkTitle>
<Date>3/14/2013 2:12:39 PM</Date>
<Description>Removed from group by Administration.</Description>
</Item>
</Items>
</YourMembership_Response>

I have written this code to write just IDs into CSV, which works fine.

with open("output1.csv", "wb") as f:
    writer = csv.writer(f)
    for node in tree.findall('.//ID'):
        writer.writerow([node.text])

Now, when I attempt to write multiple data points into the CSV, the data points are simply appended into one column. This is the code I have been trying:

with open("test1.csv", "wb") as f:
    writer = csv.writer(f)
    for node in tree.findall('.//ID'):
        writer.writerow([node.text])
    for node in tree.findall('.//FirstName'):
        writer.writerow([node.text])
    for node in tree.findall('.//LastName'):
        writer.writerow([node.text])

I need the data to look like this in the CSV, with other data points of my choosing added later on. What am I doing wrong?

ID                    FirstName     LastName
92304823A-2932         John           Smith

Thank you in advance.

I don't have an answer for the input size, but there are roughly 15000 members I have to do this for.
– RustyShackleford, Commented Sep 12, 2017 at 19:40

Bill Bell · Accepted Answer · 2017-09-12 19:54:47Z

This is, in essence, how to collect the data.

>>> from xml.etree import ElementTree
>>> tree = ElementTree.parse('api.xml')
>>> tree.findall('.//Item')
[<Element 'Item' at 0x0000000006679EA8>, <Element 'Item' at 0x0000000006681318>]
>>> for item in tree.findall('.//Item'):
...     item.find('ID').text, item.find('FirstName').text, item.find('LastName').text
... 
('92304823A-2932', 'John', 'Smith')
('92304823A-2932', 'John', 'Smith')

In contrast, when you use a construct like tree.findall('.//ID') you are asking ElementTree to start at the root of tree (that's the '.' part) and search all of its descendants for every occurrence of 'ID' at once. With your sample XML you get a flat list of two ID elements. They come back in document order, but nothing ties each ID to the FirstName and LastName of the same Item, and writing each value with its own writerow call puts every value on a separate row. What you need to do is first find all of the Item entries, then find the three pieces of data of interest within each Item.
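As a minimal sketch of the per-Item lookup, the snippet below builds one tuple per Item. It is not part of the session above: the sample XML, the helper name rows_from_items, and the choice of findtext with a default of '' are all assumptions made for illustration. findtext is used because item.find('X').text raises AttributeError when the element is missing and gives None when it is empty, both of which are plausible in data like yours.

```python
from xml.etree import ElementTree

# Hypothetical sample: the second Item has no FirstName element at all.
SAMPLE_XML = """<Items>
<Item><ID>A-1</ID><FirstName>John</FirstName><LastName>Smith</LastName></Item>
<Item><ID>B-2</ID><LastName>Jones</LastName></Item>
</Items>"""


def rows_from_items(root, fields):
    """Return one tuple per Item, using '' for any missing or empty field."""
    return [
        tuple(item.findtext(field, default="") for field in fields)
        for item in root.findall(".//Item")
    ]


root = ElementTree.fromstring(SAMPLE_XML)
rows = rows_from_items(root, ["ID", "FirstName", "LastName"])
for row in rows:
    print(row)
```

With this sample, the flat lookups root.findall('.//ID') and root.findall('.//FirstName') return lists of different lengths (two and one), so they cannot be paired up reliably, whereas each tuple in rows always belongs to a single Item.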

Addendum:

>>> import csv
>>> with open('api.csv', 'w', newline='') as csvfile:
...     fieldnames = ['ID', 'FirstName', 'LastName']
...     writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
...     writer.writeheader()
...     for item in tree.findall('.//Item'):
...         writer.writerow({
...             'ID': item.find('ID').text,
...             'FirstName': item.find('FirstName').text,
...             'LastName': item.find('LastName').text})

Resulting output file:

ID,FirstName,LastName
92304823A-2932,John,Smith
92304823A-2932,John,Smith
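Since you mention adding other data points later, here is one possible way to generalize the addendum so that the column list is a parameter. Treat it as a sketch under assumptions rather than a definitive version: the function name write_items_csv, the in-memory sample, and the use of findtext with a default of '' (instead of find(...).text) are mine, not something from your API.

```python
import csv
import io
from xml.etree import ElementTree


def write_items_csv(root, fields, out):
    """Write a header plus one CSV row per Item to the file-like object `out`.

    Returns the number of Item rows written.
    """
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    count = 0
    for item in root.iter("Item"):
        # Missing or empty elements become empty cells instead of errors.
        writer.writerow({f: item.findtext(f, default="") for f in fields})
        count += 1
    return count


# In-memory demonstration with a hypothetical one-Item document.
sample = ElementTree.fromstring(
    "<YourMembership_Response><Items>"
    "<Item><ID>92304823A-2932</ID><FirstName>John</FirstName>"
    "<LastName>Smith</LastName><EmployerName>abc company</EmployerName></Item>"
    "</Items></YourMembership_Response>"
)
buffer = io.StringIO()
written = write_items_csv(
    sample, ["ID", "FirstName", "LastName", "EmployerName"], buffer
)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file you would pass an open file instead of the buffer, for example open('api.csv', 'w', newline='') as in the addendum, together with tree.getroot(). Adding a column is then just a matter of adding its tag name to the list.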
