<div class="book-cover-image">
<img alt="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities" class="img-responsive" src="https://cdn.downtoearth.org.in/library/medium/2016-05-23/0.42611000_1463993925_book-cover.jpg" title="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities"/>
</div>

I need to extract the title attribute value from every div of this kind. What is the best way to do this?

I am trying to fetch the title of all the books mentioned on this page.

I have tried this so far:

import requests
from bs4 import BeautifulSoup as bs

url1 = "https://www.downtoearth.org.in/books"
page1 = requests.get(url1, verify=False)

# print(page1.content)

soup1 = bs(page1.content, 'html.parser')
class_names = soup1.find_all('div', {'class': 'book-cover-image'})

for class_name in class_names:
    title_text = class_name.text
    print(class_name)
    print(title_text)
  • Add an example input and your desired output from that. Commented Jun 30, 2019 at 13:12
  • <div class="book-cover-image"> <img alt="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities" class="img-responsive" src="cdn.downtoearth.org.in/library/medium/2016-05-23/…" title="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities"/> </div> Commented Jun 30, 2019 at 13:20
  • Output should be title: NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities Commented Jun 30, 2019 at 13:20
  • What did you try so far? Commented Jun 30, 2019 at 13:23
  • url1 ="downtoearth.org.in/books" page1 = requests.get(url1, verify=False) #print(page1.content) soup1= bs(page1.content, 'html.parser') class_names = soup1.find_all('div',{'class':'book-cover-image'} ) for class_name in class_names: title_text = class_name.text print(class_name) #print(title_text) Commented Jun 30, 2019 at 13:24

2 Answers


To get the title attribute of every book cover, you can use the CSS selector .book-cover-image img[title]. It selects all <img> tags that have a title attribute and are descendants of an element with the class book-cover-image:

import requests
from bs4 import BeautifulSoup

url = 'https://www.downtoearth.org.in/books'
soup = BeautifulSoup(requests.get(url).text, 'lxml')

for i, img in enumerate(soup.select('.book-cover-image img[title]'), 1):
    print('{:>4}\t{}'.format(i, img['title']))

Prints:

   1    State of India’s Environment 2019: In Figures (eBook)                           
   2    Victim Africa (eBook)                                                           
   3    Frames of change - Heartening tales that define new India                       
   4    STATE OF INDIA’S ENVIRONMENT 2019                                               
   5    State of India’s Environment In Figures 2018 (eBook)                            
   6    Getting to know about environment                                               
   7    CLIMATE CHANGE NOW - The Story of Carbon Colonisation                           
   8    Climate change - For the young and curious                                      
   9    Conflicts of Interest: My Journey through India’s Green Movement                
  10    Body Burden: Lifestyle Diseases                                                 
  11    STATE OF INDIA’S ENVIRONMENT 2018                                               
  12    DROUGHT BUT WHY? How India can fight the scourge by abandoning drought relief   
  13    SOE 2017 (Print version) and SOE 2017 in Figures (Digital version) combo offer  
  14    State of India's Environment 2017 In Figures (eBook)                            
  15    Environment Reader for Universities                                             
  16    Not in My Backyard  (Book & DVD combo offer)                                    
  17    The Crow, Honey Hunter and the Kitchen Garden                                   
  18    BIOSCOPE OF PIU & POM                                                           
  19    SOE 2017 and Food book combo offer                                              
  20    FIRST FOOD: Culture of Taste                                                    
  21    Annual State Of India’s Environment - SOE 2017                                  
  22    An 8-million-year-old mysterious date with monsoon  (e-book)                    
  23    Why I Should be Tolerant                                                        
  24    NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities  
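
The same selector can be tried without hitting the site. The following is a minimal sketch on an inline HTML sample modelled on the markup in the question; the second block (a cover image with no title attribute) is a made-up case added for illustration, and 'html.parser' is used here only so that lxml is not required. It also shows why the attempt in the question printed nothing useful: .text returns text nodes, while the title is stored in an attribute.

```python
from bs4 import BeautifulSoup

# Inline sample modelled on the markup in the question. The second block is a
# hypothetical cover image that has no title attribute.
SAMPLE_HTML = """
<div class="book-cover-image">
  <img alt="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities"
       class="img-responsive"
       title="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities"/>
</div>
<div class="book-cover-image">
  <img alt="Untitled cover" class="img-responsive"/>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# .text only collects text nodes; the divs contain nothing but an <img>,
# so only whitespace comes back.
div_texts = [div.text.strip()
             for div in soup.find_all("div", class_="book-cover-image")]

# img[title] skips images that have no title attribute at all,
# so indexing with ["title"] is safe here.
titles = [img["title"] for img in soup.select(".book-cover-image img[title]")]

# Without the [title] filter, .get() returns None instead of raising KeyError.
titles_or_none = [img.get("title")
                  for img in soup.select(".book-cover-image img")]

print(div_texts)
print(titles)
print(titles_or_none)
```

Whether to filter with img[title] or to use .get("title") depends on whether images without a title should be skipped or reported.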

2 Comments

Thank you. If possible, please explain the last two lines of code.
@AnchalSarraf soup.select('.book-cover-image img[title]') runs the CSS selector against the soup (as described in the answer), and print('{:>4}\t{}'.format(i, img['title'])) does basic string formatting: {:>4} right-aligns the value in a field 4 characters wide, and \t inserts a tab before the title.
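
To make the formatting part concrete, here is a small sketch that needs only the standard library. The two titles are sample values taken from the printed output in the answer; the variable names are chosen just for this illustration.

```python
# Two sample titles taken from the output shown in the answer.
titles = ["Victim Africa (eBook)", "Body Burden: Lifestyle Diseases"]

# enumerate(..., 1) starts the counter at 1 instead of 0.
# {:>4} right-aligns the counter in a 4-character field; \t adds a tab.
lines = ['{:>4}\t{}'.format(i, title) for i, title in enumerate(titles, 1)]

for line in lines:
    print(line)

# A two-digit counter still fits in the same field width.
padded_ten = '{:>4}'.format(10)
print(repr(padded_ten))
```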

You can also do this with XPath:

import requests
from lxml import html

url1 ="https://www.downtoearth.org.in/books"
res = requests.get(url1, verify=False)
tree = html.fromstring(res.text)
d = tree.xpath("//div[@class='book-cover-image']//img/@title")
for title in d:
    print(title)

Output

State of India’s Environment 2019: In Figures (eBook)
Victim Africa (eBook)
Frames of change - Heartening tales that define new India
STATE OF INDIA’S ENVIRONMENT 2019
State of India’s Environment In Figures 2018 (eBook)
Getting to know about environment
CLIMATE CHANGE NOW - The Story of Carbon Colonisation
Climate change - For the young and curious
Conflicts of Interest: My Journey through India’s Green Movement
Body Burden: Lifestyle Diseases
STATE OF INDIA’S ENVIRONMENT 2018
DROUGHT BUT WHY? How India can fight the scourge by abandoning drought relief
SOE 2017 (Print version) and SOE 2017 in Figures (Digital version) combo offer
State of India's Environment 2017 In Figures (eBook)
Environment Reader for Universities
Not in My Backyard  (Book & DVD combo offer)
The Crow, Honey Hunter and the Kitchen Garden
BIOSCOPE OF PIU & POM
SOE 2017 and Food book combo offer
FIRST FOOD: Culture of Taste
Annual State Of India’s Environment - SOE 2017
An 8-million-year-old mysterious date with monsoon  (e-book) 
Why I Should be Tolerant
NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities
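
One thing to be aware of with the expression above: @class='book-cover-image' matches only when the class attribute is exactly that string. The sketch below runs on an inline sample rather than the live page; the extra class name "featured" is hypothetical and only there to show the difference. The second expression is the usual XPath 1.0 idiom for matching one class token among several.

```python
from lxml import html

# Inline sample; the class "featured" on the second div is a made-up example.
SAMPLE_HTML = """
<html><body>
<div class="book-cover-image">
  <img class="img-responsive" title="Victim Africa (eBook)"/>
</div>
<div class="book-cover-image featured">
  <img class="img-responsive" title="Body Burden: Lifestyle Diseases"/>
</div>
</body></html>
"""

tree = html.fromstring(SAMPLE_HTML)

# Exact string comparison: the div with two classes is not matched.
exact = tree.xpath("//div[@class='book-cover-image']//img/@title")

# Token match: pad the class list with spaces and look for the padded name.
token = tree.xpath(
    "//div[contains(concat(' ', normalize-space(@class), ' '),"
    " ' book-cover-image ')]//img/@title"
)

print(exact)
print(token)
```

On the page in the question the exact comparison appears to be enough, judging by the output above; the token form is only a safeguard in case the markup gains additional classes.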

1 Comment

Thank You Rahul, it is simple and easily understandable
