I'm having an issue with my XML file. I would like to achieve the same as in: https://www.delftstack.com/howto/python/xml-to-csv-python/

However, my XML file looks a bit different, for example:

<students>
<student name="Rick Grimes" rollnumber="1" age="15"/>
<student name="Lori Grimes" rollnumber="2" age="16"/>
<student name="Judith Grimes" rollnumber="4" age="13"/>
</students>

The code specified in the link does not work with this formatting.

from xml.etree import ElementTree

tree = ElementTree.parse("input.xml")
root = tree.getroot()

for student in root:
    name = student.find("name").text
    roll_number = student.find("rollnumber").text
    age = student.find("age").text
    print(f"{name},{roll_number},{age}")

I have very little coding experience, so hoping someone on here can help me out.

Expected result:

Rick Grimes,1,15 Lori Grimes,2,16 Carl Grimes,3,14 Judith Grimes,4,13

Actual result:

AttributeError: 'NoneType' object has no attribute 'text'

  • Look at student.attrib, which returns a dictionary of the attribute keys and values, rather than student.text. Another point: find() returns only the first matching tag. You can use findall() instead, which returns a list, or iter("tag_name"). Commented Jan 12, 2023 at 17:40
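
To illustrate the comment above, here is a minimal sketch of how find(), findall() and iter() differ. The two-student XML string is a shortened copy of the sample from the question, and the variable names are only for illustration.

```python
from xml.etree import ElementTree

# Shortened copy of the XML from the question (illustrative only).
root = ElementTree.fromstring(
    '<students>'
    '<student name="Rick Grimes" rollnumber="1" age="15"/>'
    '<student name="Lori Grimes" rollnumber="2" age="16"/>'
    '</students>'
)

# find() returns only the first matching child element (or None).
first = root.find("student")

# findall() returns a list of all matching direct children.
all_students = root.findall("student")

# iter() walks the whole tree and yields every element with that tag.
names = [student.get("name") for student in root.iter("student")]

print(first.get("name"))
print(len(all_students))
print(names)
```

Here get() is used to read a single attribute; it returns None instead of raising an error when the attribute is missing.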

4 Answers

1

text refers to the text content between an element's opening and closing tags. For example:

<student> text here </student>

You don't have any text here because your tags are self-closing. What you are looking for is the element's attrib attribute, which is described in the xml.etree.ElementTree documentation.

Something like this should help you get what you're looking for:

for student in root:
    print(student.attrib)
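
As a rough sketch of the difference, the snippet below parses one element that has text and one that has only attributes. The XML string is a made-up sample combining the two forms shown above, and the variable names are hypothetical.

```python
from xml.etree import ElementTree

# Made-up sample: the first element has text, the second only attributes.
root = ElementTree.fromstring(
    '<students>'
    '<student>text here</student>'
    '<student name="Rick Grimes" rollnumber="1" age="15"/>'
    '</students>'
)

with_text, with_attrs = list(root)

print(with_text.text)     # text here
print(with_text.attrib)   # {}
print(with_attrs.text)    # None
print(with_attrs.attrib)  # {'name': 'Rick Grimes', 'rollnumber': '1', 'age': '15'}

# find("name") looks for a child element called <name>, not an attribute,
# so it returns None here. Calling .text on that None raises the
# AttributeError from the question.
print(with_attrs.find("name"))  # None
```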

0

You cannot get the text when there is no text to get. Because your values are stored as attributes, use .attrib[key] instead.

I have modified your example so that it will work with your XML file.

from xml.etree import ElementTree

tree = ElementTree.parse("input.xml")
root = tree.getroot()

for student in root:
    name = student.attrib["name"]
    roll_number = student.attrib["rollnumber"]
    age = student.attrib["age"]
    print(f"{name},{roll_number},{age}")

I hope this will help you.
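
If the goal is an actual CSV rather than printed lines, the same loop can feed the standard library's csv module. This is only a sketch under some assumptions: the XML is inlined as a string so the example runs on its own, the output goes to an in-memory buffer rather than a file, and the column list `fields` is chosen here for illustration.

```python
import csv
import io
from xml.etree import ElementTree

# XML from the question, inlined so the example is self-contained.
xml_string = """<students>
<student name="Rick Grimes" rollnumber="1" age="15"/>
<student name="Lori Grimes" rollnumber="2" age="16"/>
<student name="Judith Grimes" rollnumber="4" age="13"/>
</students>"""

root = ElementTree.fromstring(xml_string)

# Assumed column order; these are the attribute names used in the sample.
fields = ["name", "rollnumber", "age"]

# Write to an in-memory buffer; a real file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(fields)  # header row
for student in root.findall("student"):
    # get() falls back to an empty string if an attribute is missing.
    writer.writerow([student.get(field, "") for field in fields])

csv_text = buffer.getvalue()
print(csv_text)
```

Using csv.writer instead of an f-string means values that contain commas or quotes are escaped correctly.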

0

import io
from xml.etree import ElementTree

xml_string = """<students>
        <student name="Rick Grimes" rollnumber="1" age="15"/>
        <student name="Lori Grimes" rollnumber="2" age="16"/>
        <student name="Judith Grimes" rollnumber="4" age="13"/>
        </students>"""

file = io.StringIO(xml_string)
tree = ElementTree.parse(file)
root = tree.getroot()

result = ""
for student in root:
    result += f"{student.attrib['name']},{student.attrib['rollnumber']},{student.attrib['age']} "
print(result)

Result:

Rick Grimes,1,15 Lori Grimes,2,16 Judith Grimes,4,13 

0

For XML with such a simple structure, you can also use the built-in pandas function read_xml, which does the job in two lines of code:

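import pandas as pd

# Read every <student> element into a DataFrame; the attributes become columns.
# read_xml uses the lxml parser by default (pass parser='etree' to use the standard library instead).
df = pd.read_xml('caroline.xml', xpath='.//student')
# When given a path, to_csv writes the file and returns None, so there is nothing to assign.
df.to_csv('caroline.csv', index=False)

# For visualization only
with open('caroline.csv', 'r') as f:
    print(f.read())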

Output:

name,rollnumber,age
Rick Grimes,1,15
Lori Grimes,2,16
Judith Grimes,4,13

With the option header=False in to_csv you can also omit the header row from the CSV file.
