I need to parse a directory of XML files into one large CSV file. From each file I need certain attributes ('Name' and 'PNum') of the element 'Param'. The directory also contains a file called Content.xml, from which I can get the names of all the other XML files and set them as the FileName. The issue is that I cannot figure out how to get these attributes from each XML file, as each file is organised differently and some don't seem to have these attributes in the first place.

I have written code that works for one of the XML files in the directory and outputs a CSV file with all the relevant information.

import xml.etree.ElementTree as ET
import csv

FileName = '------.xml'
tree = ET.parse(FileName)
# index 4 is the fifth child of the root, which this file's loop reads Name/PNum from
root = tree.getroot()[4]

# newline='' stops the csv module from writing blank rows on Windows
csv_out = open('CsvOut', 'w', newline='')

csvwriter = csv.writer(csv_out)

# write the header row once, before the data rows
csv_head = ['Generation', 'Parameter Name', 'Parameter Number']
csvwriter.writerow(csv_head)

# file name without the '.xml' extension
gen = FileName[:-4]

for child in root:
    # append gen itself, not a list wrapping it, so the cell holds plain text
    parameters = [gen]
    name = child.get('Name')
    parameters.append(name)
    num = child.get('PNum')
    parameters.append(num)
    csvwriter.writerow(parameters)

csv_out.close()

1 Answer

It's rather simple, and you can do it in two steps:

  • First, enumerate all XML files in the directory
  • Then run your code over each of these files
import xml.etree.ElementTree as ET
import csv
from glob import glob

# create csv writer (newline='' avoids blank rows on Windows)
csv_out = open('CsvOut', 'w', newline='')
csvwriter = csv.writer(csv_out)
# write the header
csv_head = ['Generation', 'Parameter Name', 'Parameter Number']
csvwriter.writerow(csv_head)

# iterate over the xml files in the current directory
for FileName in glob("*.xml"):
    tree = ET.parse(FileName)
    root = tree.getroot()[4]
    # file name without the '.xml' extension
    gen = FileName[:-4]
    for child in root:
        # append gen itself, not a list wrapping it, so the cell holds plain text
        parameters = [gen]
        name = child.get('Name')
        parameters.append(name)
        num = child.get('PNum')
        parameters.append(num)
        csvwriter.writerow(parameters)

# after iterating, close the csv file
csv_out.close()
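
The question mentions that each file is organised differently and that some lack these attributes, so a fixed index such as getroot()[4] may not hold for every file. Below is a minimal sketch of one way around that, assuming the elements really are named 'Param' (as the question states) and may sit at any depth. The helper names collect_params and xml_dir_to_csv are hypothetical, and skipping malformed files or incomplete elements is an assumption rather than something the question asks for.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_params(xml_path):
    """Return (generation, name, pnum) rows for every Param element in one file."""
    generation = Path(xml_path).stem  # file name without the .xml extension
    try:
        root = ET.parse(xml_path).getroot()
    except ET.ParseError:
        return []  # assumption: silently skip files that are not well-formed XML
    rows = []
    # iter() walks the whole tree, so the depth of Param does not matter
    for param in root.iter('Param'):
        name = param.get('Name')
        num = param.get('PNum')
        if name is None or num is None:
            continue  # assumption: skip Param elements missing either attribute
        rows.append((generation, name, num))
    return rows


def xml_dir_to_csv(directory, csv_path):
    """Write the Param rows of every .xml file in directory to one CSV file."""
    with open(csv_path, 'w', newline='') as csv_out:
        writer = csv.writer(csv_out)
        writer.writerow(['Generation', 'Parameter Name', 'Parameter Number'])
        for xml_path in sorted(Path(directory).glob('*.xml')):
            writer.writerows(collect_params(xml_path))
```

Calling xml_dir_to_csv('.', 'CsvOut') would cover the current directory. Content.xml is matched by the glob as well, but it contributes no rows unless it contains 'Param' elements with both attributes.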
1 Comment

Thank you. I managed to figure it out just before you answered, but hopefully this helps others with the same problem.