Python: cannot parse Pinterest [duplicate]

Question

I need to parse Pinterest, but instead of links to pictures I get incomprehensible, non-working links.

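import requests
from bs4 import BeautifulSoup

def parse():
    url = 'https://www.pinterest.ie/'
    # Fetches the raw HTML only; no JavaScript is executed
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'lxml')
    print(soup.find_all('a'))

parse()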

Have you LOOKED at the source code for that page, using View Source or by printing out r.text? The HTML you fetch contains little more than ads. The page is built dynamically with JavaScript. You'd need to use something like Selenium to get a real browser involved.
– Tim Roberts, Commented Sep 3, 2022 at 6:19
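
One way to act on that comment is to summarize the fetched HTML instead of reading it by eye. The following is a minimal sketch, not the commenter's method: it uses the standard library's html.parser so it runs without extra packages, and the names TagCounter, summarize_html, and sample are hypothetical. The sample string is an invented stand-in for r.text, not real Pinterest markup.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how often each tag opens in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


def summarize_html(html):
    """Return (anchor_count, script_count) for the given HTML text."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return counter.counts.get("a", 0), counter.counts.get("script", 0)


# Hypothetical stand-in for r.text: a JavaScript shell with almost no content.
sample = (
    "<html><head><script src='app.js'></script></head>"
    "<body><div id='root'></div><script>boot()</script></body></html>"
)
anchors, scripts = summarize_html(sample)
print(f"{anchors} <a> tags, {scripts} <script> tags")
```

Few or no links alongside several script tags suggests the page is assembled in the browser, so the same check can be run on r.text before deciding whether a real browser is needed.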

M B · Accepted Answer · 2022-09-03 06:20:13Z

The site requires JavaScript, and no JavaScript runs when you fetch the page with requests and parse the response with BeautifulSoup: you only get the initial HTML. A common workaround is to use Selenium to open the page in an actual browser (so the JavaScript runs), and then use BeautifulSoup to parse the rendered HTML.

Something like this should work:

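from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Path to the ChromeDriver binary; adjust for your machine
driver = webdriver.Chrome(service=Service("../chromedriver.exe"))

def parse():
    url = 'https://www.pinterest.ie/'
    driver.get(url)  # a real browser loads the page and runs its JavaScript
    html = driver.page_source  # HTML as currently rendered in the browser
    soup = BeautifulSoup(html, 'lxml')
    print(soup.find_all('a'))

try:
    parse()
finally:
    driver.quit()  # close the browser and stop the driver process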

You will, of course, need some idea of how to use Selenium. The official docs should help.
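
Since the question asks for links to pictures rather than every anchor, it may help to pull the src of each img tag out of the rendered HTML. This is a rough sketch under assumptions: it uses the standard library's html.parser to stay dependency-free (soup.find_all('img') would be the BeautifulSoup equivalent), the names ImageCollector and extract_image_urls are hypothetical, and the rendered string is invented sample markup standing in for driver.page_source.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # skip <img> tags that carry no src attribute
            self.urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return the image URLs found in html, resolved against base_url."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.urls


# Invented sample markup standing in for driver.page_source.
rendered = (
    '<div><a href="/pin/1/"><img src="https://example.com/a.jpg"></a>'
    '<img src="/static/b.png"><img alt="no source"></div>'
)
print(extract_image_urls(rendered, "https://www.pinterest.ie/"))
```

Passing the page URL as base_url turns relative paths into full links. Whether the images are present at all still depends on the browser having finished rendering before page_source is read.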

