
I'm having difficulty finding the proper syntax to extract the value of an attribute with Beautiful Soup from an HTML5 document.

I've isolated the matching tags in my soup, using the attrs syntax that HTML5 data-* attributes require (their names can't be passed as keyword arguments):

tags = soup.find_all(attrs={"data-topic":"recUpgrade"})

Taking just tags[1]:

date = tags[1].find(attrs={"data-datenews":True})

and date here is:

<span class="invisible" data-datenews="2018-05-25 06:02:19" data-idnews="2736625" id="horaCompleta"></span>

Now I want to extract the date and time "2018-05-25 06:02:19" from that tag, but I can't work out the syntax.

Insight/help please.

  • You can get the attribute value directly from the element. Commented May 25, 2018 at 16:25

1 Answer


You can read a tag's attributes by key, the same way you would index a dictionary.

Example:

from bs4 import BeautifulSoup
s = """<span class="invisible" data-datenews="2018-05-25 06:02:19" data-idnews="2736625" id="horaCompleta"></span>"""
soup = BeautifulSoup(s, "html.parser")
print(soup.span["data-datenews"])

Output:

2018-05-25 06:02:19
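Applied to the lookup chain from the question, the same idea might look like the sketch below. Only the span comes from the question; the wrapping div with data-topic="recUpgrade" is assumed markup added so the example runs on its own, and the variable names raw, missing, and parsed are just for illustration. It also shows .get(), which returns None instead of raising KeyError when an attribute is absent, and datetime.strptime for turning the string into a datetime object.

```python
from datetime import datetime

from bs4 import BeautifulSoup

# Assumed markup: only the <span> is taken from the question,
# the surrounding <div> is hypothetical.
html = """
<div data-topic="recUpgrade">
  <span class="invisible" data-datenews="2018-05-25 06:02:19" data-idnews="2736625" id="horaCompleta"></span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# data-* names are not valid Python keyword arguments, so pass them via attrs.
tags = soup.find_all(attrs={"data-topic": "recUpgrade"})
date = tags[0].find(attrs={"data-datenews": True})

# Dictionary-style access raises KeyError if the attribute is missing.
raw = date["data-datenews"]

# .get() returns None for a missing attribute instead of raising.
missing = date.get("data-missing")

# Optional: convert the attribute string into a datetime object.
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")

print(raw)
print(missing)
print(parsed.isoformat())
```

Use date["..."] when the attribute must be there and .get() when it may be absent.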

3 Comments

Interesting. So I need to parse again? (I used the parser, not shown, to get `soup`.)
So I tried this syntax: print(date.span["data-datenews"]) and got TypeError: 'NoneType' object has no attribute '__getitem__'. But print(date["data-datenews"]) works. Why?
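You need to find the span tag first. In your code, date is already that span, so date.span searches inside it for a nested span, finds none, and returns None.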
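To make the difference concrete, here is a small sketch built on the span from the question. It assumes date was obtained with find(), as in the question, so it is already the span Tag. Note that on Python 3 the failing line would report "'NoneType' object is not subscriptable" rather than the Python 2 message quoted above.

```python
from bs4 import BeautifulSoup

s = '<span class="invisible" data-datenews="2018-05-25 06:02:19" data-idnews="2736625" id="horaCompleta"></span>'
soup = BeautifulSoup(s, "html.parser")

# find() returns the matching Tag itself, here the <span>.
date = soup.find(attrs={"data-datenews": True})
print(date.name)

# date.span means "the first <span> nested inside date"; there is none.
nested = date.span
print(nested)

# Indexing None is what raised the TypeError in the comment above.
try:
    nested["data-datenews"]
except TypeError as exc:
    print("TypeError:", exc)

# Indexing the Tag itself works.
value = date["data-datenews"]
print(value)
```

In the answer, soup.span works because soup is the whole document and the span sits inside it; date.span fails because date is the span.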
