Every job item in the Stack Overflow jobs RSS feed carries several tags, each in its own "category" element.

They look like this:

<category>scala</category>
<category>hadoop</category>
<category>apache-spark</category>
<category>hive</category>
<category>json</category>

I would like to use feedparser to collect all of an item's tags into a list, but I always get only the first one. The feedparser documentation mentions entries[i].content, but I am unsure whether that is the right approach or how to use it in this case.

Here is my code:

import feedparser

rss_url = "https://stackoverflow.com/jobs/feed"
feed = feedparser.parse(rss_url)
items = feed["items"]

for item in items:
    title = item["title"]
    try:
        tags = []
        tags.append(item["category"])
        print(title + " " + str(tags))
    except:
        print("Failed")

1 Answer

category on a feedparser item is an alias for the term of the first element in the item's tags list, which is why you only ever see one value. tags is a list of dictionary-like feedparser objects, one per <category> element, each with a term attribute that contains the tag name.

You can just access the terms directly:

categories = [t.term for t in item.get('tags', [])]

For your code that is:

for item in items:
    title = item["title"]
    categories = [t.term for t in item.get('tags', [])]
    print(title, ', '.join(categories))

See the entries[i].tags documentation.
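
If you want the same logic in a reusable form, here is a minimal sketch. The helper name entry_categories and the sample entries are made up for illustration; plain dicts stand in for the dictionary-like objects feedparser returns, so the snippet runs without network access. It assumes each tag exposes a "term" key, as described above, and additionally skips tags without a term and drops duplicates, which may or may not be what you want.

```python
def entry_categories(entry):
    """Return every tag term of a feed entry, in feed order, without duplicates."""
    terms = []
    for tag in entry.get("tags", []):
        term = tag.get("term")
        # Skip tags that carry no term, and terms that were already collected.
        if term and term not in terms:
            terms.append(term)
    return terms


# Hypothetical entries shaped like feedparser's output (not real feed data).
sample_entries = [
    {
        "title": "Data Engineer",
        "tags": [{"term": "scala"}, {"term": "hadoop"}, {"term": "scala"}],
    },
    {"title": "Entry without tags"},
]

for entry in sample_entries:
    print(entry["title"], entry_categories(entry))
```

With a real feed you would pass each item from feed["items"] to the helper instead of the sample dicts.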
