0 votes
2 answers
98 views

Here is the code I'm trying to run: import nodriver as uc async def main(): browser = await uc.start(headless=True) page = await browser.get('https://www.nowsecure.nl') if __name__ == '...
asked by grasswistle
-3 votes
1 answer
94 views

I have an issue scraping school data. I need each school's email address and website URL. I tried a lot, but it keeps returning empty results. What's the best way to do this? Here is the code: from selenium import ...
asked by Omprakash S
1 vote
1 answer
47 views

I am trying to download some data (in an efficient way); however, I encountered an unexpected problem. Here is the code that works just fine: import requests import os from bs4 import BeautifulSoup ...
asked by IPII
2 votes
1 answer
77 views

I have to scrape a website which returns a static PDF file. The only Python package that can access the document successfully is scrapling. However, the PDF file returned is not saved correctly in my ...
asked by pythonuser43343
0 votes
0 answers
46 views

I’m building a Telegram bot in Python that scrapes points of interest (POIs) from Google Maps using Selenium. The overall flow works, but as soon as I try to scroll through and click on the .hfpxzc ...
asked by Vazgen Hakobjanyan
0 votes
0 answers
38 views

I have a scraping script that navigates to a website that contains the widget I want to scrape. The driver clicks a button that should trigger an XHR request and has never failed to do so in my manual ...
asked by Cole Kuhlers
1 vote
0 answers
900 views

I'm trying to download YouTube videos using the yt-dlp Python library in a deployed web application. The same script works perfectly on my local machine, but in the production (live) environment, it ...
asked by Jeba Angelline Mary M SNSIHUB
-7 votes
1 answer
71 views

I am scraping an Amazon page using Python and saving the result into a CSV file. This code runs well, but the problem is that I get some product names without the first word. So for example ...
asked by ellie_in_wonderland
0 votes
0 answers
36 views

I am trying to scrape a very specific site where two things don't work as expected: rejecting the cookie banner using the reject button, and entering a specific section to get a view about the ...
asked by CoMpUtEr1941
2 votes
1 answer
78 views

I have a Scrapy CrawlSpider that parses reviews, using scrapy-rotating-proxies. But when I tried to connect to the site I got a 507 status code. In ...
asked by CollonelDain
2 votes
1 answer
164 views

On some webpages that contain a <canvas> element, I have tried every single method of making the browser bigger and the page bigger. I did find some methods that will make everything big, but ...
asked by MOB
-2 votes
2 answers
80 views

I want to use the BeautifulSoup library to fetch clinical trials data ("mitochondrial diseases") for my research studies. Although they have an API, I want to use web scraping. URL = ...
asked by Gautam Sharma
1 vote
2 answers
194 views

I am trying to use Selenium to click the Accept all or Reject all button on a cookie pop-up for the website autotrader.co.uk, but I cannot get it to make the pop-up disappear for some reason. This ...
asked by teeeeee
2 votes
1 answer
88 views

Background: by default the website only shows a few names, and there's a "moreBtn" to generate the full list. Code idea: create an HTML session and render with a script clicking the "moreBtn...
asked by Beginner
1 vote
1 answer
107 views

import zipfile import json import os from selenium import webdriver from selenium.webdriver.chrome.options import Options from selenium.webdriver.common.by import By import time def ...
asked by fozan javaid
-1 votes
2 answers
56 views

import requests from bs4 import BeautifulSoup url = "https://example-news-site.com" headers = { "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)" } response =...
asked by LANAYA88
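A common first check for requests like the one above: send a browser-like User-Agent, since many sites serve blocked or empty pages to Python's default agent string. A minimal stdlib sketch (the URL is the placeholder from the question):

```python
import urllib.request

# Placeholder URL carried over from the question snippet.
URL = "https://example-news-site.com"

def build_request(url: str) -> urllib.request.Request:
    """Attach a browser-like User-Agent, since many sites return 403
    or empty pages to the default Python-urllib agent string."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    )

req = build_request(URL)
print(req.get_header("User-agent"))
```

With the requests library the equivalent is the `headers=` keyword the snippet already uses; the failure mode is usually the agent string, not the mechanism.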
1 vote
2 answers
87 views

I am trying to pull a geojson file from here. The JSON appears as expected when I paste that link into Chrome or Safari. However, I get the following error every time when I run the following code on ...
asked by opposity
0 votes
0 answers
169 views

I have a simple yfinance program that is supposed to get financial statements for any company I pick. I run the code and it just returns an empty dataframe. My yfinance and everything else is up to ...
asked by ridhamb
0 votes
0 answers
74 views

I'm trying to write a simple web scraper using the scraper crate to learn Rust, and I encountered a weird (for me) problem. My find_element function can't find <tr> elements unless they're ...
asked by crotylaldehyde
1 vote
2 answers
120 views

I wrote a Python script for scraping data from the WHO website. I wanted to retrieve the title, author name, date, PDF link and child-page link from the parent page (I applied some filters on the parent page). I am ...
asked by Mann Jain
0 votes
0 answers
30 views

Some websites update JWTs regularly to prevent scraping: in the browser, JS sends an XHR to the server to get a fresh token (see the Token XHR in the picture below). E.g. curl "https://www.nemlig.com/webapi/Token...
asked by Igor Savinkin
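One way to mirror that browser behaviour is to fetch a fresh token first and attach it as a bearer header on the real call. A stdlib sketch; the token-response field name (`access_token`) and the second endpoint are assumptions, so inspect the actual Token XHR in DevTools for the real shapes:

```python
import json
import urllib.request

TOKEN_URL = "https://www.nemlig.com/webapi/Token"  # endpoint from the question

def authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a request carrying the freshly fetched JWT."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})

def refresh_then_call(api_url: str, opener=urllib.request.urlopen):
    """Fetch a fresh token, then call the API with it.

    Assumption: the token response is JSON with an 'access_token' field;
    check the real Token XHR in DevTools for the actual payload shape.
    """
    with opener(TOKEN_URL) as resp:
        token = json.load(resp)["access_token"]
    return opener(authorized_request(api_url, token))
```

Re-running the refresh whenever a 401 comes back keeps the scraper aligned with the site's rotation schedule.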
0 votes
0 answers
183 views

Here's a brief overview of what I want to achieve: extract raw HTML pages and save them, then use Crawl4AI to produce a 'cleaner' and smaller HTML that keeps a lot of the information, including what I will eventually ...
asked by Leksa99
2 votes
2 answers
154 views

I'm using the Scrapfly API to scrape a webpage using a GET request with a JavaScript scenario like this: "js_scenario": [ { "fill": { "selector": "form#...
asked by Asjad Gohar
-3 votes
1 answer
78 views

I'm using Python 3.12.3, Selenium 4.31.0 and the Firefox driver on Ubuntu 24.04. When I try to open a URL, a cookie consent popup appears, asking whether to continue without accepting, to accept, or to see more options. How can I ...
asked by Michael
-1 votes
2 answers
57 views

I'm using Python 3.12.3, Selenium 4.31.0 and the Firefox driver. How do I retrieve the class attribute in this HTML tag? <div class="costa-itinerary-tile FS03250425_BCN03A20" data-cc-cruise-id="...
asked by Michael
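In Selenium the usual call for this is `element.get_attribute("class")`. As a dependency-free illustration of pulling that attribute out of the markup shown above, here is a stdlib `html.parser` sketch (the `data-cc-cruise-id` value is a placeholder, since the original was truncated):

```python
from html.parser import HTMLParser

# Markup from the question; the data-cc-cruise-id value is a placeholder.
HTML = '<div class="costa-itinerary-tile FS03250425_BCN03A20" data-cc-cruise-id="x"></div>'

class ClassGrabber(HTMLParser):
    """Collect the class attribute of each div, mimicking what
    Selenium's element.get_attribute('class') returns."""
    def __init__(self):
        super().__init__()
        self.classes = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            attr_map = dict(attrs)  # attrs arrives as a list of (name, value) pairs
            if "class" in attr_map:
                self.classes.append(attr_map["class"])

parser = ClassGrabber()
parser.feed(HTML)
print(parser.classes)
```

Note that `class` comes back as one space-separated string; split it if the individual class names are needed.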
2 votes
1 answer
210 views

Is it possible to scrape the polygon data from this interactive map in R? 🔗 https://fogocruzado.org.br/mapadosgruposarmados The map shows territories controlled by armed groups across different years....
asked by Maria Mittelbach
0 votes
1 answer
102 views

I'm trying to do some scraping for educational purposes; I just started and am fairly new to Python. My problem is that in Selenium I am trying to scrape a product page and take the name, price, shipping ...
asked by Lahearle
0 votes
2 answers
118 views

https://www.cruisetimetables.com/invergordon-scotland-cruise-ship-schedule-2025.html So far I have: scrape1 <- read_html('https://www.cruisetimetables.com/invergordon-scotland-cruise-ship-...
asked by dreddlord
0 votes
1 answer
42 views

I want to access the tables of the following website: https://www.marketbeat.com/ratings/ However, pages can only be changed by setting the "Reporting Date". I do know that I can change the ...
asked by AmyTheGhostHunter
0 votes
0 answers
155 views

I would like to extract data from a few tables from the following website: https://www2.bmf.com.br/pages/portal/bmfbovespa/boletim1/SistemaPregao1.asp?pagetype=pop&caminho=Resumo%20Estat%EDstico%...
asked by Osvaldo Assunção
0 votes
1 answer
40 views

Context and Problem: I'm developing a service that consumes an API hosted on port 448. The API is protected by Azure's WAF V2, which limits requests: it allows only 150 consecutive requests, after ...
asked by Alejandro Echeverria
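A simple way to stay under a burst limit like this is to count requests and pause before the limit is hit. A sketch, assuming the 150-request burst from the question and a guessed 60-second cooldown (both should be tuned to the WAF's real block window):

```python
import time

class BurstThrottle:
    """Pause after every `burst` calls so a WAF burst limit (150 here,
    per the question) is never exceeded. The 60 s cooldown is a guess;
    tune it to the WAF's actual block window."""
    def __init__(self, burst=150, cooldown=60.0, sleep=time.sleep):
        self.burst = burst
        self.cooldown = cooldown
        self.sleep = sleep        # injectable so tests can fake the delay
        self.count = 0

    def tick(self):
        """Call once before each outgoing request."""
        self.count += 1
        if self.count >= self.burst:
            self.sleep(self.cooldown)
            self.count = 0
```

Usage: create one `BurstThrottle()` per worker and call `throttle.tick()` immediately before each API request.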
0 votes
0 answers
132 views

When I request a proxy, what basically happens is that it delivers an HTTP proxy, which currently seems unusable to me, because the vast majority of sites use HTTPS and this causes the request to be ...
asked by Digital Farmer
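Worth noting here: an `http://` proxy URL is normally still usable for HTTPS targets, because the client opens a CONNECT tunnel through the proxy and runs TLS end-to-end inside it. A stdlib sketch (host and port are placeholders):

```python
import urllib.request

# The proxy URL scheme (http://) describes how to talk *to the proxy*;
# HTTPS traffic is still carried through it via a CONNECT tunnel.
# Host and port below are placeholders.
proxies = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}
opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
# opener.open("https://example.com")  # tunnels the TLS session through the proxy
```

With the requests library the equivalent is `requests.get(url, proxies=proxies)` using the same mapping; a proxy only fails for HTTPS if it refuses the CONNECT method.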
0 votes
0 answers
88 views

I'm writing a Python Selenium scraper for a web page that uses infinite scrolling to load content dynamically. Over time, as more posts are loaded, the JavaScript heap memory usage in ChromeDriver ...
asked by mohammad jcm
-1 votes
1 answer
59 views

Trying to pull data from FAA N-Number results, but requests.get() doesn't seem to be working. I followed this tutorial (https://www.youtube.com/watch?v=QhD015WUMxE) and was able to scrape the website he ...
asked by Emily Stauf
0 votes
1 answer
74 views

I have a Python web scraper that pulls a specific value every 1 second. The target website is AJAXed, so I'm not hitting it with too many requests. This is the Python code: import time import logging ...
asked by Matteo
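For a once-per-second poller like this, it can help to schedule against a monotonic clock so the fetch's own latency doesn't make the interval drift over long runs. A sketch, where the `fetch` callable stands in for whatever request the scraper makes:

```python
import time

def poll(fetch, interval=1.0, iterations=None,
         clock=time.monotonic, sleep=time.sleep):
    """Call `fetch` once per `interval` seconds, subtracting the time the
    fetch itself took so a 1 s schedule does not drift over hours."""
    next_tick = clock()
    results = []
    n = 0
    while iterations is None or n < iterations:
        results.append(fetch())
        n += 1
        next_tick += interval            # schedule the next slot absolutely
        delay = next_tick - clock()      # subtract time spent in fetch()
        if delay > 0:
            sleep(delay)
    return results
```

`poll(scrape_once, interval=1.0)` runs forever; pass `iterations=` for a bounded run or a test.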
0 votes
1 answer
78 views

I have this issue with node-fetch, which returns the error: FetchError { message: 'maximum redirect reached at: https://www.alsa.es/o/Alsa-main-theme/images/web2020/logo-alsa.svg', type: '...
asked by dalvi
1 vote
0 answers
65 views

This code is supposed to download some documents it must locate within a series of given links. While it does seemingly locate the link to the PDF file, it's failing to download it. What might be the ...
asked by 42WaysToAnswerThat
0 votes
1 answer
51 views

Automating Amazon UK with Selenium: Handling CAPTCHA, Setting Postcode, and Extracting Product Data I'm automating Amazon UK (www.amazon.co.uk) using Selenium to: Decline cookies (if present). Click ...
asked by Luis swift
0 votes
3 answers
95 views

From the webpages below I'd like to extract data: https://www.ams.usda.gov/services/enforcement/organic/settlements https://www.ams.usda.gov/services/enforcement/organic/settlements-2023 "03/19/2025...
asked by Anjali Kushwaha
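Once the page text is in hand, dates in the MM/DD/YYYY shape shown above can be pulled with a regular expression. A sketch over stand-in text (the real pages would be fetched and stripped to text first):

```python
import re

# Stand-in text mimicking the settlement listings; fetch the real pages
# (e.g. with urllib.request) and strip the HTML to text before matching.
sample = "Settlement Agreement 03/19/2025 ... Settlement Agreement 11/02/2023"

# MM/DD/YYYY dates, matching the 03/19/2025 form quoted in the question.
dates = re.findall(r"\b\d{2}/\d{2}/\d{4}\b", sample)
print(dates)
```

The word boundaries (`\b`) keep the pattern from matching inside longer digit runs; tighten it (e.g. month 01-12) if the pages contain other slash-separated numbers.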
1 vote
1 answer
26 views

I'm new to Selenium and trying to undertake a live example of web-scraping a list using the following URL - https://mcscertified.com/find-an-installer/ However, I'm struggling to click on a drop-down ...
asked by Lee Murray
0 votes
1 answer
51 views

SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1018) This is my code: import requests s = requests....
asked by Identicon
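CERTIFICATE_VERIFY_FAILED usually means Python cannot find a CA bundle that validates the site's chain, not that the site itself is broken. A stdlib sketch of the verifying context involved; the bundle path is illustrative (the `certifi` package exposes a known-good one via `certifi.where()`):

```python
import ssl

# A default context verifies the server certificate against the system's
# trusted CA store; CERTIFICATE_VERIFY_FAILED usually means that store is
# missing or incomplete for the chain the site presents.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)

# If the local store is broken, point the context at a known-good bundle
# (path illustrative; certifi.where() returns a real one):
# ctx.load_verify_locations(cafile="/path/to/cacert.pem")
```

With the requests library shown in the snippet, the equivalent knob is `requests.get(url, verify="/path/to/cacert.pem")`; disabling verification with `verify=False` hides the symptom but removes TLS protection.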
-1 votes
2 answers
65 views

import requests from bs4 import BeautifulSoup url = 'https://www.tori.fi/recommerce/forsale/item/22362242' headers = {"User-Agent": "Mozilla/5.0"} response = requests.get(url, ...
asked by Dotres
-1 votes
1 answer
139 views

I am not well versed in advanced data scraping and coding. However, there is a webpage ("https://www.ksestocks.com/HistoryHighLow") which carries a dropdown menu with various options. I ...
asked by Huma Wakani
1 vote
1 answer
579 views

I am creating a flow in Microsoft Power Automate Desktop that retrieves the weather information for a user-specified city. The flow prompts the user to enter a city, then opens Google Search to get ...
asked by Lucas Rijnberk
0 votes
0 answers
143 views

I have recently switched from selenium to nodriver for speed and stealth reasons. I am having trouble accessing elements inside an iframe even though material on this site and elsewhere says that '...
asked by Stephen Smith
1 vote
1 answer
632 views

I am currently trying to take a screenshot of a given web page using Crawl4ai, however each time that I try to do it I get an error or I don't get anything. Here is the code I used that is the same ...
asked by Bernardo
1 vote
1 answer
633 views

I'm trying to use an Ollama LLM in Python (VS Code) to scrape data from a website. But whenever I run the code it gives an error: ERROR [browser] Failed to initialize Playwright browser: BrowserType....
asked by Mohammad Abdullah
0 votes
0 answers
67 views

I am trying to get data from the web. Unfortunately, I get this error. I suppose that it identifies a big file using a search and then tries to have a conversation about that entire file. Is there any ...
asked by Karel Macek
-1 votes
1 answer
29 views

I am interested in a site where, when you enter a login, it goes to a password entry page. How can I enter data in such a case?
asked by semorka
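One approach to a two-step login like this is to drive both steps with a cookie-preserving session, posting the login form first and the password form second. A stdlib sketch; every URL and field name below is hypothetical and should be read from each page's real <form> markup:

```python
import http.cookiejar
import urllib.parse
import urllib.request

# The opener keeps cookies between the two steps, so the server can tie
# the password submission back to the login submission.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

def post(url: str, fields: dict) -> urllib.request.Request:
    """Build a form-encoded POST, the way a browser submits a <form>."""
    data = urllib.parse.urlencode(fields).encode()
    return urllib.request.Request(url, data=data, method="POST")

# Step 1: submit the login, which leads to the password page.
# opener.open(post("https://example.com/login", {"login": "me"}))
# Step 2: submit the password, reusing the session cookie from step 1.
# opener.open(post("https://example.com/password", {"password": "secret"}))
```

The field names (`login`, `password`) and action URLs come from each form's `<input name>` and `<form action>` attributes; some sites also require a hidden CSRF token copied from the first page's HTML.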
0 votes
3 answers
109 views

This is the code I wrote to generate the data: info <- html_nodes(manga, ".mt4") %>% html_text2() %>% strsplit("\n") It returns 50 rows of lists that look like this: [...
asked by doot