I want to extract the `name` and `d` tags for each food item from an XML file. Note that a single `<food>` element can contain more than one `<name>`, and each `<d>` belongs to the `<name>` that precedes it.
I thought about making all the `d` tags children of their `name` tag and then looping over the contents of each `name`, but I'm not sure how to go about that, or whether there is a more efficient way. I'm open to other solutions. I have some code, but it's not there yet. Thank you!
## XML
<?xml version="1.0"?>
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <d>price 5.95</d>
    <d>Two of our famous Belgian Waffles
    with plenty of real maple syrup</d>
    <d>650 cal</d>
    <name>Belgian Waffles Light</name>
    <d>price 5.15</d>
    <d>Two of our famous Belgian Waffles with less calories</d>
    <d>450 cal</d>
  </food>
  <food>
    <name>Strawberry Belgian Waffles</name>
    <d>price 7.95</d>
    <d>Light Belgian waffles covered
    with strawberries and whipped cream</d>
    <d>900 cal</d>
  </food>
  <food>
    <name>French Toast</name>
    <d>price 4.50</d>
    <d>Thick slices made from our
    homemade sourdough bread</d>
    <d>600 cal</d>
  </food>
</breakfast_menu>
## My code
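import xml.etree.ElementTree as ET
import pandas as pd

tree = ET.parse('xml_doc_txt.txt')
root = tree.getroot()  # was `mytree.getroot()`, but `mytree` is never defined
print([elem.tag for elem in root.iter()])  # inspect the tags in the document

rows = []  # collect plain dicts; DataFrame.append was removed in pandas 2.0
for node in root.iter('food'):
    for name in node.findall('name'):
        Name = name.text
        # Problem: this visits every <d> in the <food>, not only the ones that
        # follow this <name>, so names and descriptions get mixed up.
        for d in node.findall('d'):
            description = d.text  # node.findtext('d') only ever returns the first <d>
            rows.append({'Name': Name, 'Description': description})

df = pd.DataFrame(rows, columns=['Name', 'Description'])
df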
## Desired output
The desired DataFrame should have two columns, like so:
| Name | Description |
| --- | --- |
| Belgian Waffles | price 5.95, Two of our famous..., 650 cal |
| Belgian Waffles Light | price 5.15, Two of our famous..., 450 cal |
| Strawberry Belgian Waffles | price 7.95, Light Belgian waffles..., 900 cal |
...
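
One possible approach, sketched here under the assumption that every `<d>` belongs to the closest preceding `<name>` inside the same `<food>`: instead of re-parenting the `d` tags, walk each `<food>`'s children in document order and start a new row whenever a `<name>` shows up. The helper name `group_descriptions` and the inline `SAMPLE_XML` string (a shortened copy of the menu above) are made up for this illustration and are not part of the original code.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the menu, inlined only to keep the sketch self-contained.
SAMPLE_XML = """<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <d>price 5.95</d>
    <d>Two of our famous Belgian Waffles
    with plenty of real maple syrup</d>
    <d>650 cal</d>
    <name>Belgian Waffles Light</name>
    <d>price 5.15</d>
    <d>Two of our famous Belgian Waffles with less calories</d>
    <d>450 cal</d>
  </food>
  <food>
    <name>French Toast</name>
    <d>price 4.50</d>
    <d>Thick slices made from our
    homemade sourdough bread</d>
    <d>600 cal</d>
  </food>
</breakfast_menu>"""


def group_descriptions(root):
    """Return one {'Name', 'Description'} dict per <name>, in document order."""
    rows = []
    for food in root.iter('food'):
        current = None  # row for the most recent <name> in this <food>
        for child in food:  # direct children, in document order
            if child.tag == 'name':
                current = {'Name': (child.text or '').strip(), 'Description': []}
                rows.append(current)
            elif child.tag == 'd' and current is not None:
                # Collapse line breaks and indentation inside multi-line <d> text
                current['Description'].append(' '.join((child.text or '').split()))
    # Join each name's descriptions into a single comma-separated string
    for row in rows:
        row['Description'] = ', '.join(row['Description'])
    return rows


rows = group_descriptions(ET.fromstring(SAMPLE_XML))
for row in rows:
    print(row)
```

With the real file, `ET.parse('xml_doc_txt.txt').getroot()` would take the place of `ET.fromstring(SAMPLE_XML)`, and `pd.DataFrame(rows, columns=['Name', 'Description'])` should then give the two-column frame.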