I have an HTML file, say:

<html>...  
  <li id="123"></li>
  <li id="3455"></li>
  ....
</html>

How do I get the values of all the ids alone in Python using BeautifulSoup? The desired output is: ["123","3455"]

2 Answers

To get the list you want, use a list comprehension. It can be done in one line as follows (last line):

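from bs4 import BeautifulSoup

html = '<html> <li id="123"></li><li id="3455"></li> </html>'
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid a warning

# Collect the id attribute of every <li> element
attrs = [li['id'] for li in soup.find_all('li')]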
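If some `<li>` elements in the real file have no `id`, indexing with `li['id']` raises a `KeyError`. A minimal sketch of one way around that, assuming BeautifulSoup 4 and the built-in `html.parser`: pass `id=True` to `find_all` so only tags that carry an `id` attribute are returned. The sample markup, including the middle item without an id, is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the middle <li> deliberately has no id attribute.
html = '<html><li id="123"></li><li>no id here</li><li id="3455"></li></html>'
soup = BeautifulSoup(html, "html.parser")

# id=True keeps only <li> tags that actually have an id attribute,
# so li["id"] cannot raise KeyError for the middle item.
ids = [li["id"] for li in soup.find_all("li", id=True)]
print(ids)  # ['123', '3455']
```

An alternative is `li.get('id')`, which returns `None` for a missing attribute instead of raising; that suits cases where the positions of the items matter.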

2 Comments

You're welcome @Abhinav. I saw your edit, but find_all is the current BeautifulSoup 4 syntax; findAll is the BS 3 name, and it still works in BS 4 as well.
OK! I am using BS 3, I guess, so find_all didn't work for me. Good to know.
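# BeautifulSoup 3 import (Python 2 only); for BS 4 use: from bs4 import BeautifulSoup
from BeautifulSoup import BeautifulSoup

foo = '<html> <li id="123"> </li> <li id="3455"></li> </html>'

soup = BeautifulSoup(foo)

# Print the id attribute of each <li>; the name 'li' avoids shadowing the built-in id()
for li in soup.html.findAll('li'):
    print(li['id'])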
