I am web scraping with Python, using the BeautifulSoup library.

I have HTML markup like this:

<tr class="deals" data-url="www.example2.com">
<span class="hotel-name">
<a href="www.example2.com"></a>
</span>
</tr>
<tr class="deals" data-url="www.example3.com">
<span class="hotel-name">
<a href="www.example3.com"></a>
</span>
</tr>

I want to get the data-url or the href value from every <tr>, preferably the href value.

Here is a little snippet of my relevant code:

main_url =  "http://localhost/test.htm"
page  = requests.get(main_url).text
soup_expatistan = BeautifulSoup(page)

print (soup_expatistan.select("tr.deals").data-url)
# or  print (soup_expatistan.select("tr.deals").["data-url"])

1 Answer

You can use the CSS selector "tr.deals span.hotel-name a" to get to the link:

from bs4 import BeautifulSoup

data = """
<tr class="deals" data-url="www.example.com">
<span class="hotel-name">
<a href="wwwexample2.com"></a>
</span>
</tr>
"""

# Name the parser explicitly; html.parser keeps the bare <tr> elements as written
soup = BeautifulSoup(data, "html.parser")
print(soup.select('tr.deals span.hotel-name a')[0]['href'])

Prints:

wwwexample2.com

If you have multiple links, iterate over them:

for link in soup.select('tr.deals span.hotel-name a'):
    print(link['href'])
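
Since the question also asks about data-url, here is a minimal sketch that reads both values from each row. It assumes the markup from the question and Python's built-in html.parser; extract_deals is a hypothetical helper name used only for illustration, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (two rows).
html = """
<tr class="deals" data-url="www.example2.com">
<span class="hotel-name">
<a href="www.example2.com"></a>
</span>
</tr>
<tr class="deals" data-url="www.example3.com">
<span class="hotel-name">
<a href="www.example3.com"></a>
</span>
</tr>
"""


def extract_deals(markup):
    """Return one dict per tr.deals row with its data-url and link href.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(markup, "html.parser")
    results = []
    for row in soup.select("tr.deals"):
        # select_one returns None when the row has no matching link
        link = row.select_one("span.hotel-name a")
        results.append({
            # .get returns None instead of raising if the attribute is missing
            "data_url": row.get("data-url"),
            "href": link.get("href") if link is not None else None,
        })
    return results


deals = extract_deals(html)
for deal in deals:
    print(deal["data_url"], deal["href"])
```

select() always returns a list, which is why the attempts in the question fail: attributes such as data-url live on the individual elements, so they have to be read per row (or per link) inside the loop.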

2 Comments

This works. Actually, I have multiple <tr>s in my markup. How do I iterate over all the <tr>s and find the <a> link in each?
