I'm a beginner at web scraping in Python with BeautifulSoup. I am trying to scrape a real estate website where each listing has a row with different information in each column. However, every column uses the same class name, so when I try to scrape each column I get the same result every time.

Link of the website I was trying to scrape.

Code From The HTML

<div class="lst-middle-section resale">
<div class="item-datapoint va-middle">
    <div class="lst-sub-title stub text-ellipsis">Built Up Area</div>
    <div class="lst-sub-value stub text-ellipsis">2294 sq.ft.</div>
</div>
<div class="item-datapoint va-middle">
    <div class="lst-sub-title stub text-ellipsis">Avg. Price</div>
    <div class="lst-sub-value stub text-ellipsis"><i class="icon-rupee"></i> 6.5k / sq.ft.</div>
</div>
<div class="item-datapoint va-middle">
    <div class="lst-sub-title stub text-ellipsis">Possession Date</div>
    <div class="lst-sub-value stub text-ellipsis">31st Dec, 2020</div>
</div>
</div>

Code I Tried!

for item in all:
    try:
        print(item.find('span', {'class': 'lst-price'}).getText())
        print(item.find('div', {'class': 'lst-heading'}).getText())
        print(item.find('div', {'class': 'item-datapoint va-middle'}).getText())
        print('')
    except AttributeError:
        pass

If I use the class 'item-datapoint va-middle' again, it still returns the built-up area, not the average price or the possession date.

Solution? TIA!

  • Can you show us the code that you tried? Commented Nov 1, 2019 at 7:47
  • I edited a code. Commented Nov 1, 2019 at 12:14

1 Answer

Use Selenium's find_elements_by_class_name (plural) instead of find_element_by_class_name, which returns only the first match.

find_elements_by_class_name("item-datapoint.va-middle")

You will get a list of elements.

Selenium docs: Locating Elements
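Since the question uses BeautifulSoup, the same idea applies there: `find` returns only the first match, while `find_all` returns every match. The following is a minimal sketch, assuming the HTML fragment from the question is representative of the page; the `html` string and the `details` dictionary are illustrative names, not part of the original code.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (assumed to be representative).
html = """
<div class="lst-middle-section resale">
  <div class="item-datapoint va-middle">
    <div class="lst-sub-title stub text-ellipsis">Built Up Area</div>
    <div class="lst-sub-value stub text-ellipsis">2294 sq.ft.</div>
  </div>
  <div class="item-datapoint va-middle">
    <div class="lst-sub-title stub text-ellipsis">Avg. Price</div>
    <div class="lst-sub-value stub text-ellipsis"><i class="icon-rupee"></i> 6.5k / sq.ft.</div>
  </div>
  <div class="item-datapoint va-middle">
    <div class="lst-sub-title stub text-ellipsis">Possession Date</div>
    <div class="lst-sub-value stub text-ellipsis">31st Dec, 2020</div>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns every matching element; find would stop at the first one.
details = {}
for point in soup.find_all("div", class_="item-datapoint"):
    title = point.find("div", class_="lst-sub-title")
    value = point.find("div", class_="lst-sub-value")
    if title and value:
        # Key each value by its label so the order of columns does not matter.
        details[title.get_text(strip=True)] = value.get_text(strip=True)

print(details["Possession Date"])
```

Keying the values by their label avoids depending on the position of a column, which may differ between listings.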

Edit:

from selenium import webdriver

url = 'https://housing.com/in/buy/search?f=eyJiYXNlIjpbeyJ0eXBlIjoiUE9MWSIsInV1aWQiOiJhMWE1MjFmYjUzNDdjYT' \
      'AxNWZlNyIsImxhYmVsIjoiQWhtZWRhYmFkIn1dLCJub25CYXNlQ291bnQiOjAsImV4cGVjdGVkUXVlcnkiOiIlMjBBaG1lZGFiYWQiL' \
      'CJxdWVyeSI6IiBBaG1lZGFiYWQiLCJ2IjoyLCJzIjoiZCJ9'

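# Note: the find_elements_by_* helpers were removed in Selenium 4.3;
# there, use driver.find_elements(By.CSS_SELECTOR, ".item-datapoint.va-middle").
driver = webdriver.Chrome()
driver.get(url)
# Dots join the two class names, as in a CSS selector.
fields = driver.find_elements_by_class_name("item-datapoint.va-middle")
# Print each field together with its position in the list.
for i, field in enumerate(fields):
    print(i, field.text)
driver.quit()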

The output shows the index of every element in the list (fields).

Pick the elements you want by index, for example:

poss_date = fields[2].text
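
Indexing by position breaks if a listing omits a field or orders them differently. As a hedged alternative, the texts can be keyed by their label instead. This sketch assumes each `field.text` holds the label on the first line and the value on the following line(s), which is not confirmed by the original answer; `fields_to_dict` and `sample` are hypothetical names used only for illustration.

```python
def fields_to_dict(texts):
    """Map each field's label (first line) to its value (remaining lines)."""
    result = {}
    for text in texts:
        # Drop empty lines and surrounding whitespace.
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        if len(lines) >= 2:
            result[lines[0]] = " ".join(lines[1:])
    return result


# Stand-in for [field.text for field in fields]; the format is an assumption.
sample = [
    "Built Up Area\n2294 sq.ft.",
    "Avg. Price\n6.5k / sq.ft.",
    "Possession Date\n31st Dec, 2020",
]

details = fields_to_dict(sample)
print(details["Possession Date"])
```

With the Selenium code above, the call would be `fields_to_dict([field.text for field in fields])`, made before `driver.quit()`.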

2 Comments

TypeError: 'NoneType' object is not callable.
See the edit for the fix: replace the spaces in the class name with dots.
