This is what the HTML looks like:

<p class="details">
<span>detail1</span>
<span class="number">1</span>
<span>detail2</span>
<span>detail3</span>
</p>

I need to extract detail2 and detail3.

But with this piece of code I only get detail1:

info = data.find("p", class_="details").span.text

How do I extract the needed items?

Thanks in advance!

2 Answers

Select your elements more specifically. In your case, that means all <span> siblings that follow the <span> with class number:

soup.select('span.number ~ span')

Example

from bs4 import BeautifulSoup
html='''<p class="details">
<span>detail1</span>
<span class="number">1</span>
<span>detail2</span>
<span>detail3</span>
</p>'''
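# Name the parser explicitly to avoid the "no parser was explicitly specified" warning
soup = BeautifulSoup(html, "html.parser")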

[t.text for t in soup.select('span.number ~ span')]

Output

['detail2', 'detail3']
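
If you would rather not use a CSS selector, roughly the same result can be sketched with find_next_siblings(), which walks the siblings that come after a tag. This is only an illustration on the sample HTML above; the variable names number and details are my own, and it assumes the span with class number is always present:

```python
from bs4 import BeautifulSoup

html = """<p class="details">
<span>detail1</span>
<span class="number">1</span>
<span>detail2</span>
<span>detail3</span>
</p>"""

soup = BeautifulSoup(html, "html.parser")

# Locate the marker span, then collect every <span> sibling that follows it
number = soup.find("span", class_="number")
details = [s.get_text(strip=True) for s in number.find_next_siblings("span")]

print(details)
```

As with the selector version, anything before the number span (detail1 here) is skipped.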

You can find all <span>s and do normal indexing:

from bs4 import BeautifulSoup

html_doc = """\
<p class="details">
<span>detail1</span>
<span class="number">1</span>
<span>detail2</span>
<span>detail3</span>
</p>"""

soup = BeautifulSoup(html_doc, "html.parser")

spans = soup.find("p", class_="details").find_all("span")

for s in spans[-2:]:
    print(s.text)

Prints:

detail2
detail3

Or CSS selectors:

spans = soup.select(".details span:nth-last-of-type(-n+2)")

for s in spans:
    print(s.text)

Prints:

detail2
detail3
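
Both variants assume the <p class="details"> element exists. On real pages find() may return None, so a slightly more defensive version could look like the sketch below. The helper name last_span_texts and its count parameter are hypothetical, not part of Beautiful Soup:

```python
from bs4 import BeautifulSoup


def last_span_texts(html, count=2):
    """Return the text of the last `count` <span> tags inside p.details."""
    if count <= 0:
        return []
    soup = BeautifulSoup(html, "html.parser")
    paragraph = soup.find("p", class_="details")
    if paragraph is None:
        # No matching paragraph: return an empty list instead of raising
        return []
    spans = paragraph.find_all("span")
    return [s.get_text(strip=True) for s in spans[-count:]]


html_doc = """\
<p class="details">
<span>detail1</span>
<span class="number">1</span>
<span>detail2</span>
<span>detail3</span>
</p>"""

print(last_span_texts(html_doc))
```

Note that indexing from the end only works if the wanted items are always the last spans in the paragraph.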

1 Comment

Thank you! I will proceed with this option. Have a great day ahead!
