Why is this web scrape not working on python?

Question

I have recently been using the code attached. For the past few weeks it has worked completely fine and always produced results. However, when I ran it today, for some reason it didn’t work. Could you please help and provide a solution to the problem?

import requests, json
from bs4 import BeautifulSoup

headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

params = {"q": "dji", "hl": "en", 'gl': 'us', 'tbm': 'shop'}

response = requests.get("https://www.google.com/search",
                        params=params,
                        headers=headers)
soup = BeautifulSoup(response.text, 'lxml')
# list with two dict() combined
shopping_data = []
shopping_results_dict = {}


for shopping_result in soup.select('.sh-dgr__content'):
    title = shopping_result.select_one('.Lq5OHe.eaGTj h4').text
    product_link = f"https://www.google.com{shopping_result.select_one('.Lq5OHe.eaGTj')['href']}"
    source = shopping_result.select_one('.IuHnof').text
    price = shopping_result.select_one('span.kHxwFf span').text

    try:
        rating = shopping_result.select_one('.Rsc7Yb').text
    except:
        rating = None

    try:
        reviews = shopping_result.select_one('.Rsc7Yb').next_sibling.next_sibling
    except:
        reviews = None

    try:
        delivery = shopping_result.select_one('.vEjMR').text
    except:
        delivery = None



    shopping_results_dict.update({
        'shopping_results': [{
            'title': title,
            'link': product_link,
            'source': source,
            'price': price,
            'rating': rating,
            'reviews': reviews,
            'delivery': delivery,
        }]
    })

    shopping_data.append(dict(shopping_results_dict))

print(title)

S.B · Accepted Answer · 2022-02-08 17:51:03Z

2

Because the .select call in for shopping_result in soup.select('.sh-dgr__content'): could not find any matching element, it returns an empty list. Therefore the body of the for loop is never executed; Python skips straight past the loop.

title is only defined once the body of the for loop has executed, so the print(title) after the loop fails with a NameError when the list is empty.

You should make sure the selector you use actually matches elements in the HTML you receive.
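A minimal sketch of how the loop could be guarded so that an empty result is reported instead of crashing later. The helper name extract_titles and the h4 title selector are assumptions for illustration; only the .sh-dgr__content selector comes from the question, and it may no longer match the live page.

```python
from bs4 import BeautifulSoup


def extract_titles(html, card_selector=".sh-dgr__content", title_selector="h4"):
    """Return the title of every result card, or an empty list if none match.

    extract_titles is a hypothetical helper, not part of the original code.
    """
    soup = BeautifulSoup(html, "html.parser")
    cards = soup.select(card_selector)  # select() returns [] when nothing matches

    if not cards:
        # Nothing matched: say so explicitly rather than relying on a
        # variable that is only assigned inside the loop body.
        print(f"No elements matched {card_selector!r}")
        return []

    titles = []
    for card in cards:
        node = card.select_one(title_selector)  # None if the title is missing
        titles.append(node.get_text(strip=True) if node is not None else None)
    return titles
```

Calling extract_titles(response.text) and checking whether the returned list is empty makes the "selector matched nothing" case visible immediately.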

edited Feb 8, 2022 at 17:51

answered Feb 8, 2022 at 17:43


6 Comments

Harrison Cox Over a year ago

Oh okay, I figured the for loop wasn’t looping. Do you know any way I could make the for loop work or make it find elements?

S.B Over a year ago

@HarrisonCox Apparently you need another way to find your element... I haven't seen the HTML of the page, but for CSS classes, make sure you spell them correctly. You could use XPath, which gives you more flexibility. I should also mention that if the element has an id attribute, that should be your first choice.

Harrison Cox Over a year ago

I’ve looked through the HTML of the website that I’m trying to access and still don’t understand the problem. Sorry to be a pain, but I’ve attached a link to the website that I used to help me write this code in the first place. If you have time, could you look through it and see if you can find a fix? Many thanks. dev.to/dmitryzub/scrape-google-shopping-with-python-49ad

S.B Over a year ago

@HarrisonCox I can only show you the path. With print(response.request.url) you can see which URL you are actually requesting. Then write the HTML to a file and try to find your element inside that file. If you inspect the created HTML file, you will see why your code doesn't work... Many websites load their content dynamically, and Beautiful Soup can't help you with that because it doesn't render JS. Other websites constantly change the names of their tags (or maybe the structure) to make things harder for scrapers...
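The debugging steps in this comment could be sketched roughly as follows. The helper names save_and_count and debug_request, the file name response.html, and the 10-second timeout are assumptions made for illustration; only print(response.request.url) and the idea of writing the HTML to a file come from the comment.

```python
from pathlib import Path

from bs4 import BeautifulSoup


def save_and_count(html, selector, path="response.html"):
    """Write the received HTML to disk and count matches for a CSS selector."""
    Path(path).write_text(html, encoding="utf-8")
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.select(selector))


def debug_request(url, params, headers, selector):
    """Hypothetical helper: fetch a page and report what was actually received."""
    import requests  # imported here so save_and_count can be used offline

    response = requests.get(url, params=params, headers=headers, timeout=10)
    print(response.request.url)  # the URL that was really requested
    print(response.status_code)  # anything other than 200 is a first clue
    return save_and_count(response.text, selector)
```

If debug_request returns 0, opening response.html in an editor and searching for a known product title shows whether the data is absent from the HTML altogether or is present under different class names.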

S.B Over a year ago

@HarrisonCox If it loads dynamically, use Selenium; if it changes the structure, use the location of elements instead of tag names.
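One possible reading of "use the location of elements" is to select by position in the document tree, anchored on a stable id, rather than by class names that may change. The markup below is entirely made up for illustration (it is not Google's HTML), and titles_by_structure is a hypothetical helper.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the class names look random and could change at any
# time, but the container id and the nesting of the elements stay the same.
SAMPLE_HTML = """
<div id="results">
  <div class="x1a"><a href="/p/1"><h4>Product A</h4></a><span>$10</span></div>
  <div class="q9z"><a href="/p/2"><h4>Product B</h4></a><span>$20</span></div>
</div>
"""


def titles_by_structure(html):
    """Return (title, href) pairs found via tree position, not class names."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    # Direct <div> children of the element with id="results"
    for card in soup.select("#results > div"):
        heading = card.select_one("a > h4")
        if heading is None:
            continue  # skip cards that do not follow the expected layout
        pairs.append((heading.get_text(strip=True), card.select_one("a")["href"]))
    return pairs
```

This approach still breaks if the nesting itself changes, so it is a trade-off rather than a guaranteed fix.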
