I'm fairly new to Python, and for one of my research projects I need a web scraper to collect web content for a dataset.

Since most of the threads I found recommended the BeautifulSoup package, I tried building a scraper with it.

The data I need to scrape is loaded only after clicking a button on the web page.

Here's an example:

http://www.engadget.com/products/apple/iphone/6/

When you click "12 Comments", a popup loads and the comments are displayed. I need to scrape those comments.

I have tried many approaches, but nothing seems to work so far. Can someone look at my code and tell me whether it can be fixed, or suggest another way of doing this?

import bs4
import requests

session = requests.Session()
url = "http://www.engadget.com/products/apple/iphone/6/"
page = session.get(url).text
soup = bs4.BeautifulSoup(page, "html5lib")

# Locate the criteria list, then each criterion label inside it.
engadgetul = soup.find("ul", class_="product-criteria-bars")
engadgetdiv = engadgetul.find_all("div", class_="product-criteria-label")

for engadgetrv in engadgetdiv:
    review = engadgetrv.find_all("p", "comment-text")
    # Print inside the outer loop so every label's comments are visited,
    # not only those of the last label.
    for rr in review:
        print(rr.span.string)

1 Answer

When you click those links, the comments are loaded dynamically with JavaScript, so they are not part of the HTML that requests downloads from the product page. You can see the requests the page makes to the server by opening your browser's developer tools (F12 in Chrome) and going to the Network tab.

Request those URLs directly instead:

http://www.engadget.com/a/hovercard_criteria_comments/?product_id=44337&criteria_id=1

http://www.engadget.com/a/hovercard_criteria_comments/?product_id=44337&criteria_id=2

(and so on for different criteria_id)
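
As a rough sketch of how that could look in Python: the endpoint and product_id below are copied from the URLs above, while the function names, the default criteria_id range of 1 to 2, and the `p.comment-text` selector (borrowed from the question's code) are assumptions. The response format is not confirmed here, so inspect the body before relying on the parsing step.

```python
from urllib.parse import urlencode

# Endpoint and product_id are copied from the URLs in the answer above.
BASE_URL = "http://www.engadget.com/a/hovercard_criteria_comments/"
PRODUCT_ID = 44337


def build_comment_url(product_id, criteria_id):
    """Return the comments URL for one product/criterion pair."""
    query = urlencode({"product_id": product_id, "criteria_id": criteria_id})
    return f"{BASE_URL}?{query}"


def extract_comments(html):
    """Pull comment text out of a response body.

    ASSUMPTION: the body is an HTML fragment containing
    <p class="comment-text"> elements, as the question's code expects.
    Adjust the selector after inspecting a real response.
    """
    import bs4  # imported lazily so build_comment_url works without bs4

    soup = bs4.BeautifulSoup(html, "html.parser")
    return [p.get_text(strip=True)
            for p in soup.find_all("p", class_="comment-text")]


def scrape_comments(product_id=PRODUCT_ID, criteria_ids=range(1, 3)):
    """Fetch each criterion's comments; returns {criteria_id: [comment, ...]}.

    The default criteria_ids (1 and 2) only mirror the two example URLs;
    the real set of ids is an assumption to verify in the Network tab.
    """
    import requests  # imported lazily, see above

    results = {}
    with requests.Session() as session:
        for criteria_id in criteria_ids:
            url = build_comment_url(product_id, criteria_id)
            response = session.get(url, timeout=10)
            response.raise_for_status()  # fail loudly on HTTP errors
            results[criteria_id] = extract_comments(response.text)
    return results
```

Calling scrape_comments() returns a dict keyed by criteria_id. If every list comes back empty, print response.text for one URL and adapt extract_comments to whatever the server actually returns.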
