1

I want to extract the text inside div tags using BeautifulSoup:

<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"></div></div>

2 Answers

1

You can try something like this:

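from bs4 import BeautifulSoup as bs

data = '<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"></div></div>'
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = bs(data, "html.parser")
# find_all is the current name for the older findAll
m = soup.find_all("div", {"class": "post contentTemplate"})
for k in m:
    print(k.get_text())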

Output:

Data to extract
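
Passing the class as the string "post contentTemplate" only matches when the attribute is written exactly that way. As a possible alternative, here is a minimal sketch using a CSS selector, which matches a div carrying both classes in any order. The variable name texts is just for illustration.

```python
from bs4 import BeautifulSoup

data = '<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"></div></div>'
soup = BeautifulSoup(data, "html.parser")

# Select every div that has both classes, regardless of their order in the attribute
texts = [div.get_text(strip=True) for div in soup.select("div.post.contentTemplate")]
print(texts)
```

This prints ['Data to extract'] for the sample above.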

0

You can use the get_text() method. Assuming html is the parsed BeautifulSoup object, this extracts the text from every div that find_all() finds in the source:

data = [e.get_text() for e in html.find_all('div')]

When run, it returns (Python 2 output; Python 3 prints the same strings without the u prefix):

[u'Data to extract', u'']

If you don't want the empty values, just filter them out:

data = [e.get_text() for e in html.find_all('div') if e.get_text()]
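
One thing to watch: the filter above only drops strings that are truly empty. If a div contains nothing but whitespace, get_text() returns that whitespace and it survives the filter. A rough sketch of one way around this, assuming a variant of the sample where the inner div holds a single space (that variant is made up for illustration), is to pass strip=True:

```python
from bs4 import BeautifulSoup

# Hypothetical variant of the sample: the inner div contains one space
source = '<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"> </div></div>'
html = BeautifulSoup(source, "html.parser")

# Plain truthiness check: whitespace-only text is kept
loose = [e.get_text() for e in html.find_all('div') if e.get_text()]

# strip=True trims each piece of text, so whitespace-only divs come back empty
strict = [e.get_text(strip=True) for e in html.find_all('div') if e.get_text(strip=True)]

print(loose)
print(strict)
```

Here loose still contains the whitespace entry, while strict contains only the real text.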
