How to convert an RSS file to XML in Python?

Question

I need to convert the CNN RSS page (http://rss.cnn.com/rss/edition.rss) to an XML file. I need to filter by the tags title, link and pubDate, and then export the result to a CSV file.

I tried some code, but it does not work: the result omits pubDate.

This is the code I use:

# Python code to illustrate parsing of XML files
# importing the required modules
import csv
import requests
import xml.etree.ElementTree as ET

def loadRSS():
    # url of rss feed
    url = 'http://rss.cnn.com/rss/edition.rss'
    # creating HTTP response object from given url
    resp = requests.get(url)
    # saving the xml file
    with open('topnewsfeed.xml', 'wb') as f:
        f.write(resp.content)

def parseXML(xmlfile):
    # create element tree object
    tree = ET.parse(xmlfile)
    # get root element
    root = tree.getroot()
    # create empty list for news items
    newsitems = []
    # iterate news items
    for item in root.findall('./channel/item'):
        # empty news dictionary
        news = {}
        # append news dictionary to news items list
        newsitems.append(news)
    # return news items list
    return newsitems

def savetoCSV(newsitems, filename):
    # specifying the fields for csv file
    fields = ['title', 'pubDate', 'description', 'link', 'media']
    # writing to csv file
    with open(filename, 'w') as csvfile:
        # creating a csv dict writer object
        writer = csv.DictWriter(csvfile, fieldnames=fields)
        # writing headers (field names)
        writer.writeheader()
        # writing data rows
        writer.writerows(newsitems)

def main():
    # load rss from web to update existing xml file
    loadRSS()
    # parse xml file
    newsitems = parseXML('topnewsfeed.xml')
    # store news items in a csv file
    savetoCSV(newsitems, 'topnews.csv')

if __name__ == "__main__":
    # calling main function
    main()

I tried changing the parameters, and the result is this:

CNN shows the RSS as a web page rather than as raw XML, unlike, for example, Reddit:

Any idea how to obtain this information?

Given the importance of indentation in Python, I think it would help a lot if you looked at the formatting of your post.
– axwr, Commented Jun 22, 2017 at 20:21

Robert Townley · Accepted Answer · 2017-06-22 20:38:52Z

The XML entry for the RSS feed you mentioned is pubdate, not pubDate with a capital D.

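If the issue is that pubdate isn't being included, that might be part of the problem: ElementTree matches tag names case-sensitively, so a lookup for pubDate will not find an element named pubdate.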
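
One way to check which spelling a feed actually uses is to print the child tags of an item and compare them with the names in the code. A minimal sketch, assuming a made-up sample item: the SAMPLE string is hypothetical, and the spelling used in it is arbitrary rather than taken from the CNN feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one feed item; not copied from the CNN feed.
SAMPLE = """<rss version="2.0"><channel><item>
<title>Example headline</title>
<link>http://example.com/story</link>
<pubDate>Thu, 22 Jun 2017 20:00:00 GMT</pubDate>
</item></channel></rss>"""

root = ET.fromstring(SAMPLE)
item = root.find('./channel/item')

# List the tag names exactly as they are spelled in the document.
tags = [child.tag for child in item]
print(tags)

# Lookups are case-sensitive: only the exact spelling matches.
found_exact = item.find('pubDate') is not None
found_lower = item.find('pubdate') is not None
print(found_exact, found_lower)
```

Running the same loop over an item from the saved topnewsfeed.xml shows the names to use in the field list.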

1 Comment

aaguirre Over a year ago

OK, this code has two parts: the first part saves the XML, and the second part works with that XML and creates a CSV file from it. At the moment I can create the XML, but I get an error when creating the CSV file.
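
For the second part, a possible approach is to copy only the wanted tags from each item into a dictionary and to give csv.DictWriter the same field list, since DictWriter raises ValueError when a row contains a key that is not in fieldnames. This is only a sketch under assumptions: parse_items, write_csv, FIELDS and SAMPLE are hypothetical names, and the sample data is made up instead of being fetched from CNN.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Tags to export; adjust the spelling to whatever the feed really uses.
FIELDS = ['title', 'link', 'pubDate']

def parse_items(xml_data, fields=FIELDS):
    """Return one dict per <item>, holding only the requested child tags."""
    root = ET.fromstring(xml_data)
    rows = []
    for item in root.findall('./channel/item'):
        # findtext returns the default when the tag is missing from the item
        rows.append({name: item.findtext(name, default='') for name in fields})
    return rows

def write_csv(rows, out, fields=FIELDS):
    """Write a header line and one CSV row per dict to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)

# Hypothetical sample with two items; the second one has no pubDate.
SAMPLE = """<rss version="2.0"><channel>
<item><title>First</title><link>http://example.com/1</link>
<pubDate>Thu, 22 Jun 2017 20:00:00 GMT</pubDate></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>"""

rows = parse_items(SAMPLE)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead of a StringIO buffer, open the target with open('topnews.csv', 'w', newline='') as the csv module documentation recommends, and pass that file object to write_csv.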
