In my last question I asked for help parsing the links out of the XML in an RSS feed. Using the suggestions I received here, combined with some extra research, I was able to write this:

import urllib
from xml.dom import minidom

def GetRSS(RSSurl):
    url_info = urllib.urlopen(RSSurl)
    if (url_info):
        xmldoc = minidom.parse(url_info)
    if (xmldoc):
        channel = xmldoc.getElementsByTagName('channel')
        for node in channel:
            item = xmldoc.getElementsByTagName('item')
            for node in item:
                alist = xmldoc.getElementsByTagName('link')
                for a in alist: 
                    linktext = a.firstChild.data
                    print linktext

As I mentioned in the other question, I wrote this to obtain the links from the RSS feed on Redlettermedia.com. The code runs without errors, and the output I receive is:

http://redlettermedia.com
http://redlettermedia.com/half-in-the-bag-b-fest-2012/
http://redlettermedia.com/an-update-from-red-letter-media/
http://redlettermedia.com/half-in-the-bag-red-tails/
http://redlettermedia.com/half-in-the-bag-the-devil-inside-and-flyin-ryan/
http://redlettermedia.com/newly-found-episode-iii-review-behind-the-scenes-footage/
http://redlettermedia.com/half-in-the-bag-the-girl-with-the-dragon-tattoo-and-2011-re-cap/
http://redlettermedia.com/mr-plinetts-indiana-jones-and-the-kingdom-of-the-crystal-skull-review/
http://redlettermedia.com/new-mr-plinkett-review-trailer/
http://redlettermedia.com/plinkett-fest/
http://redlettermedia.com/update/
http://redlettermedia.com
http://redlettermedia.com/half-in-the-bag-b-fest-2012/
http://redlettermedia.com/an-update-from-red-letter-media/
http://redlettermedia.com/half-in-the-bag-red-tails/
http://redlettermedia.com/half-in-the-bag-the-devil-inside-and-flyin-ryan/
http://redlettermedia.com/newly-found-episode-iii-review-behind-the-scenes-footage/

And so on. What I would like to do now is have a function print only the newest update link (the second line of the output, "http://redlettermedia.com/half-in-the-bag-b-fest-2012/" in this case). How would I print only that line?

  • Can you install non-stdlib modules? How do you define newest update link? Commented Feb 9, 2012 at 5:29

1 Answer

If it's always the second item in the list, you could try:

url = xmldoc.getElementsByTagName('link')[1].firstChild.data
print url
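
As a hedged alternative, here is a minimal sketch that looks up the link inside the first item element instead of relying on position [1] in the whole document. It is written for Python 3 and parses a string so it can run without network access. The names get_newest_link and SAMPLE_FEED and the example.com URLs are hypothetical, and it assumes the feed lists its newest item first, which is common but not guaranteed.

```python
from xml.dom import minidom

# Hypothetical sample feed standing in for the real RSS document.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com</link>
    <item>
      <title>Newest post</title>
      <link>http://example.com/newest-post/</link>
    </item>
    <item>
      <title>Older post</title>
      <link>http://example.com/older-post/</link>
    </item>
  </channel>
</rss>"""


def get_newest_link(xml_text):
    """Return the link of the first <item> in the feed, or None if absent."""
    xmldoc = minidom.parseString(xml_text)
    items = xmldoc.getElementsByTagName('item')
    if not items:
        return None
    # Search only inside the first item, so the channel-level <link> is skipped.
    links = items[0].getElementsByTagName('link')
    if not links or links[0].firstChild is None:
        return None
    return links[0].firstChild.data


print(get_newest_link(SAMPLE_FEED))
```

Because the lookup is scoped to the first item, it should keep working even if the channel gains extra link elements ahead of the items.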

4 Comments

This works almost perfectly, except that I get ten lines repeating the URL I was trying to get. What am I doing to cause that, instead of receiving the URL just once?
It is because you're printing it for all items in the list. You would most likely replace what is after 'for node in item:' with my suggestion but I'm unable to test at the moment...
Well, I figured that's what I should do. I replaced everything beneath 'for node in item:' with what you suggested, but I'm still getting ten lines for some reason.
Looking more closely, you would probably put it directly after 'if (xmldoc):', outside both loops.
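
To illustrate the point made in these comments, here is a small sketch (Python 3, with a hypothetical two-item feed in place of the real one) of why the lookup repeats inside the loop. It calls getElementsByTagName on the whole document and never uses the loop variable, so it returns the same value once per item; outside the loops it runs once.

```python
from xml.dom import minidom

# Hypothetical two-item feed used only for this illustration.
FEED = (
    "<rss><channel>"
    "<link>http://example.com</link>"
    "<item><link>http://example.com/a/</link></item>"
    "<item><link>http://example.com/b/</link></item>"
    "</channel></rss>"
)

xmldoc = minidom.parseString(FEED)

# Inside the loop: the lookup ignores `node`, so it runs once per <item>
# and yields the same URL every time.
inside_loop = []
for node in xmldoc.getElementsByTagName('item'):
    inside_loop.append(xmldoc.getElementsByTagName('link')[1].firstChild.data)

# Outside any loop: the lookup runs exactly once.
outside_loop = xmldoc.getElementsByTagName('link')[1].firstChild.data

print(len(inside_loop))  # one entry per <item>
print(outside_loop)
```

With ten items in the real feed, the in-loop version would produce ten identical lines, which matches what was reported.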