
I am trying to scrape all elements (images, graphs, hyperlinks) of this website, but unfortunately they are not being scraped properly. I tried using bs4.

import requests
from bs4 import BeautifulSoup

url = 'https://www.gold.org/goldhub/research/investment-update-case-gold-uk-defined-benefit-schemes'

page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
lay_content = soup.find("div", {"class": "layout-content"})

This way, the extracted content doesn't include everything. How can I do this? Thanks for reading.

  • There is a chance that a lot of these webpage elements are rendered a few seconds after the initial page load using JavaScript. Unfortunately, the Python requests library only fetches the static HTML; it doesn't run the JavaScript or wait for the data to load. I would recommend using the Python Selenium library instead of requests, as it can drive a browser engine to run the JavaScript and wait for the data to load. Commented Dec 13, 2022 at 14:35
  • @Tom I tried using Selenium in Kaggle and it doesn't work. It works easily in a desktop browser, but on my PC that is not possible due to memory issues. I am looking for an alternative solution. Commented Dec 13, 2022 at 14:41
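For reference, a minimal sketch of the Selenium approach the first comment suggests. This assumes a local Chrome/chromedriver install (not shown in the thread); headless mode keeps memory use lower than a full browser window, though it may still be too heavy for a constrained environment like Kaggle:

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

url = 'https://www.gold.org/goldhub/research/investment-update-case-gold-uk-defined-benefit-schemes'

options = Options()
options.add_argument('--headless=new')  # run without a visible window
options.add_argument('--disable-gpu')

driver = webdriver.Chrome(options=options)
try:
    driver.get(url)
    # page_source holds the DOM *after* JavaScript has run,
    # so charts and links injected by scripts are present.
    html = driver.page_source
finally:
    driver.quit()

soup = BeautifulSoup(html, 'html.parser')
images = soup.find_all('img')
links = soup.find_all('a', href=True)
```

The resulting soup can then be queried exactly as in the requests-based code above; only the way the HTML is obtained changes.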

1 Answer


You can use the find_all() method.
When no arguments are passed, it returns every element in the tree.
So, this should give you all the elements on the page.

import requests
from bs4 import BeautifulSoup

url = 'https://www.gold.org/goldhub/research/investment-update-case-gold-uk-defined-benefit-schemes'

page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
lay_content = soup.find_all()
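As a local, network-free illustration of how find_all() behaves (the HTML snippet here is a made-up example, not the actual gold.org page):

```python
from bs4 import BeautifulSoup

# A made-up HTML snippet standing in for a downloaded page.
html = """
<div class="layout-content">
  <img src="/chart.png" alt="Gold price chart">
  <a href="/goldhub/research">Research</a>
  <a href="/goldhub/data">Data</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() with no arguments returns every tag in the tree.
all_tags = soup.find_all()

# Targeted queries are usually more useful than grabbing everything:
images = soup.find_all("img")                               # all <img> tags
links = [a["href"] for a in soup.find_all("a", href=True)]  # all hyperlink URLs
```

Note that find_all() only sees tags present in the HTML string you give it; if the page injects images or charts with JavaScript, they won't appear in the static HTML that requests downloads.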

2 Comments

Thanks a lot for your effort. But in this way, graphs and hyperlinks are not scraped.
Maybe because these elements are created later by some scripts? That's why Selenium may be better for such cases.
