1

In a string containing HTML I have several links that I want to replace with the pure href value:

from bs4 import BeautifulSoup
a = "<a href='www.google.com'>foo</a> some text <a href='www.bing.com'>bar</a> some <br> text'
soup = BeautifulSoup(html, "html.parser")

tags = soup.find_all()
for tag in tags:
  if tag.has_attr('href'):
    html = html.replace(str(tag), tag['href'])

Unfortunatly this creates some issues:

  • the tags in the html use single quotes ', but beautifulsoup will create with str(tag) an tag with " quotes (<a href="www.google.com">foo</a>). So replace() will not find the match.
  • <br> get identified as <br/>. Again replace() will not find the match.

So it seems using python's replace() method will not give reliable results.

Is there a way to use beautifulsoup's methods to replace a tag with a string?

edit:

Added value for str(tag) = <a href="www.google.com">foo</a>

3
  • ‘ and “ are interchangeable (as long as you end the string with the same character) and both represent strings. Commented Sep 3, 2019 at 7:51
  • The issue is with quotes inside the string. Commented Sep 3, 2019 at 7:55
  • For me, BS just prints 'href' and not "href". Commented Sep 3, 2019 at 8:02

1 Answer 1

5

Relevant part of the docs: Modifying the tree

html="""
<html><head></head>
<body>
<a href="www.google.com">foo</a> some text 
<a href="www.bing.com">bar</a> some <br> text
</body></html>"""

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
for a_tag in soup.find_all('a'):
    a_tag.string = a_tag.get('href')
print(soup)

output

<html><head></head>
<body>
<a href="www.google.com">www.google.com</a> some text 
<a href="www.bing.com">www.bing.com</a> some <br/> text
</body></html>
Sign up to request clarification or add additional context in comments.

1 Comment

Thats a good approach. It solved my issue. I only had to use soup.text in the end and get the required result.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.