0 votes
1 answer
35 views

I am scraping the Dead by Daylight Fandom wiki (specifically TOME pages, e.g., https://deadbydaylight.fandom.com/wiki/Tome_1_-_Awakening) to extract memory logs. The goal is to extract the Memory ...
zeromiedo
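On the scraping approach: Fandom wikis run MediaWiki, so instead of parsing the rendered HTML you can often pull a page's wikitext through the `action=parse` API. A minimal sketch, assuming the endpoint and page title below are right (the live request is left commented out):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://deadbydaylight.fandom.com/api.php"

def parse_api_url(title: str) -> str:
    # Build a MediaWiki parse-API URL that returns the page's raw wikitext
    params = {"action": "parse", "page": title, "prop": "wikitext", "format": "json"}
    return f"{API}?{urlencode(params)}"

url = parse_api_url("Tome 1 - Awakening")
print(url)
# data = json.load(urlopen(url))             # live request, commented out here
# wikitext = data["parse"]["wikitext"]["*"]  # the Memory sections live in this text
```

The wikitext keeps the section markup, which is usually easier to slice for "Memory" entries than the rendered HTML.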
-2 votes
0 answers
51 views

from playwright.sync_api import sync_playwright profile_path = r"C:\Users\kdutt\AppData\Roaming\Mozilla\Firefox\Profiles\p283dicx.default-release" firefox_path = r"C:\Program Files\...
Krishnendu Dutta
3 votes
1 answer
40 views

I am trying to search for elements on a webpage and have used various methods, including text and XPath. It seems that the timeout option does not work the way I expected, and no exception is raised ...
Shankboy
0 votes
2 answers
202 views

From the structure below I only want the value of the href attribute. But rec_block is returning the h5 element without its children, so basically <h5 class="series">Recommendations</h5>. <...
Emby
  • 1
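If rec_block is the h5 itself, the recommendation links are presumably siblings that follow the heading rather than children inside it, so walk forward from the heading. A BeautifulSoup sketch on stand-in markup (the real page structure is an assumption):

```python
from bs4 import BeautifulSoup

# Stand-in markup: the <a> follows the <h5>, it is not nested inside it
html = """
<h5 class="series">Recommendations</h5>
<a href="/series/123">First recommendation</a>
"""

soup = BeautifulSoup(html, "html.parser")
h5 = soup.find("h5", class_="series")
link = h5.find_next("a")   # first <a> after the heading in document order
print(link["href"])        # /series/123
```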
0 votes
0 answers
46 views

I previously extracted the US fuel surcharge history using this JSON endpoint: https://www.ups.com/assets/resources/fuel-surcharge/us.json But, it stopped updating data after 9/22/2025. How can I ...
maxi
  • 1
0 votes
0 answers
77 views

I have a bit of code I am trying to build that takes a specific Tumblr page and then iteratively scans by post number, sequentially, checking whether a page exists. If it does, it will print that full URL to ...
Kyle Campbell
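The probe loop described above can be sketched with the standard library only; the blog URL pattern is an assumption, and the live HEAD request is commented out:

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

BASE = "https://example.tumblr.com/post/{}"   # placeholder blog (assumption)

def candidate_urls(start: int, count: int) -> list:
    # Sequential post numbers to probe
    return [BASE.format(n) for n in range(start, start + count)]

def exists(url: str) -> bool:
    # A 404 means the post number is unused; anything that loads counts as a hit
    try:
        urlopen(Request(url, method="HEAD"), timeout=10)
        return True
    except HTTPError:
        return False

for url in candidate_urls(1000, 3):
    print(url)
    # if exists(url): print("found:", url)    # live check, commented out
```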
2 votes
0 answers
99 views

I'm making a tutorial on how to scrape with Scrapy. For that, I use Quarto/RStudio and the website https://quotes.toscrape.com/. For pedagogic purposes, I need to run a first crawl on the first page, ...
Didier mac cormick
Advice
0 votes
4 replies
43 views

I wanted to know how I can get live (Indian) news feed data with no, or minimal, latency (30-40 s). I tried using some RSS feeds, but all they do is provide the data with some latency, so what I ...
its m
  • 49
0 votes
0 answers
48 views

Camoufox browser window remains visible in WSL even when headless is set to "virtual". Description: when headless is set to "virtual", the Camoufox browser window still appears on the screen in ...
exlead
  • 1
1 vote
0 answers
86 views

I want to retrieve content from a web page. However, I tried the above method, but the error still comes when the query string contains Chinese characters. Code: $json = Get-Content -Encoding utf8 -Path "./...
Akira
  • 33
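The snippet above is PowerShell, but the underlying fix is language-agnostic: percent-encode the non-ASCII query value as UTF-8 before building the URL. A Python sketch of the idea (`example.com` is a placeholder):

```python
from urllib.parse import quote, urlencode

# Raw CJK bytes in a query string are frequently rejected by servers;
# percent-encoding them as UTF-8 avoids the problem
term = "中文"
encoded = quote(term)
print(encoded)   # %E4%B8%AD%E6%96%87

url = "https://example.com/search?" + urlencode({"q": term})
print(url)
```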
-4 votes
2 answers
75 views

I am trying to write code to give me BBFC film ratings. I am using selenium to do this but would be happy with any solution that works reliably. After a lot of work I finally came up with this code: #...
Simd
  • 21.5k
0 votes
1 answer
211 views

This is my Python code, used on Ubuntu, to try to fetch and extract data from https://www.sofascore.com/. I created this test code before using it on an E2 device in my plugin. # python3 -m venv venv # source venv/...
RR-EB
  • 55
0 votes
1 answer
72 views

I'm working on integration tests for a web application that's running in a Docker container within our GitLab CI/CD pipeline. The application is a frontend that requires Kerberos/SPNEGO authentication ...
ben green
0 votes
1 answer
65 views

I'm quite new to web scraping, and in particular to using Scrapy's spiders, pipelines... I'm getting a 202 status from some spider requests' responses, hence the page content is not available yet ...
Manu310
  • 178
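A 202 typically means the server accepted the request but has not produced the content yet, so one common workaround is to let Scrapy's RetryMiddleware treat 202 as retryable. A settings sketch (the retry count and delay are assumptions to tune per site):

```python
# settings.py -- retry responses that come back as HTTP 202
RETRY_ENABLED = True
RETRY_TIMES = 5                               # re-attempts per request
RETRY_HTTP_CODES = [202, 500, 502, 503, 504]  # 202 added alongside server errors
DOWNLOAD_DELAY = 2                            # give the backend time to render
```

RetryMiddleware re-schedules any response whose status appears in RETRY_HTTP_CODES, so listing 202 there is enough; no custom middleware is needed.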
-1 votes
1 answer
47 views

I have this Apps Script / Cheerio function that successfully scrapes the data I want from the url. The site only displays 25 entries at this url. I can find additional entries on subsequent pages (by ...
zambonidude
1 vote
0 answers
40 views

Problem I’m using Docusaurus with Typesense and the docsearch-typesense-scraper to index my documentation site. Everything runs fine — the sitemap is found, and the scraper produces records. However, ...
Erwin
  • 11
0 votes
0 answers
149 views

I’m building a scraper to monitor the Meta (Facebook) Ads Library for new ads as soon as they start running. From inspecting network requests, I see that the Ads Library web app uses a GraphQL ...
kiqueboat
2 votes
1 answer
43 views

Website photo with search box visible. So, this is the website https://sa.ucla.edu/ro/public/soc. There is a dropdown menu for selecting the subject area, where I need to type the subject, and I will receive ...
Rohit Kasturi
0 votes
0 answers
126 views

I'm trying to download the barchart data table from https://www.barchart.com/investing-ideas/ai-stocks using Excel VBA in similar manner as the python script in Automatic file downloading on Barchart....
ateene
  • 1
-1 votes
1 answer
67 views

I’m using Python + Selenium + ChromeDriver to check a list of titles (from a CSV file) against an online library catalog. My script searches each title and tries to determine if a specific library has ...
huda
  • 1
3 votes
2 answers
155 views

I am trying to read in a specific table from the US Customs and Border Protection's Dashboard on Southwest Land Border Encounters as a dataframe. The url is: https://www.cbp.gov/newsroom/stats/...
Ari
  • 2,023
-2 votes
1 answer
117 views

I am webscraping WHO pages using the following code: pacman::p_load(rvest, httr, stringr, purrr) download_first_pdf_from_handle <- function(handle_id) { ...
flâneur
  • 321
0 votes
0 answers
125 views

I’m working on a project in Laravel/Python where I want to fetch product information from Shein, but I’ve run into a major problem with ShareJump links. Here’s an example link I’m working with: http://...
Mahmod Algeriany
1 vote
1 answer
126 views

I am a bit new to webscraping and trying to build a scraper to collect the title, text, and date from this archived page: from selenium import webdriver from selenium.webdriver.chrome.service import ...
Kaitlin
  • 83
1 vote
2 answers
88 views

I'm using Selenium in Python and trying to click the "See all Properties" button to get to the next web page, where all the properties will be listed and I can easily scrape the data. Here's ...
Gurnoor Kalsi
0 votes
0 answers
281 views

My goal is to find out if a given user has liked any post of another profile. So the following question has to be answered: Has the user X liked any post on the profile Y in the past 24 months. For ...
a6i09per5f
-1 votes
2 answers
104 views

I'm trying to scrape the data off this site. The website shows a charging station; in this case you can click each one to unravel the accordion and see the data per charger. I am trying to use this ...
NorthoftheWall
3 votes
1 answer
157 views

I'm working on a web scraping project in Python to collect data from a real estate website. I'm running into an issue with the addresses, as they are not always consistent. I've already handled simple ...
Adamzam15
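For inconsistent addresses, a rule-based normalization pass before comparison often catches most of the variation. A sketch; the abbreviation table and the sample addresses are made up for illustration:

```python
import re

# Map common suffix abbreviations to one canonical form (illustrative subset)
ABBREV = {"st": "street", "ave": "avenue", "rd": "road", "blvd": "boulevard"}

def normalize(address: str) -> str:
    s = address.lower().strip()
    s = re.sub(r"[.,]", "", s)      # drop punctuation
    s = re.sub(r"\s+", " ", s)      # collapse runs of whitespace
    return " ".join(ABBREV.get(w, w) for w in s.split())

print(normalize("123  Main St."))     # 123 main street
print(normalize("123 Main Street"))   # 123 main street
```

Both spellings now normalize to the same string, so they can be compared or deduplicated directly.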
1 vote
0 answers
68 views

I’m building a Make.com scenario like this: HTTP (fetch website HTML) → Text parser (extract elements) → Filter "only good links" → Array aggregator → further processing. Goal: I want
Alex Lombardo
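The "only good links" step boils down to extracting every href and then filtering. The same logic as a Python sketch (the sample HTML and the rule for what counts as "good" are assumptions):

```python
import re

# Stand-in for the HTML the HTTP module would fetch
html = '<a href="https://good.example/a">A</a> <a href="http://ads.example/x">B</a>'

links = re.findall(r'href="([^"]+)"', html)            # every href value
good = [u for u in links if u.startswith("https://")]  # keep only https links
print(good)   # ['https://good.example/a']
```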
-1 votes
3 answers
245 views

I would like to scrape the 2nd table on the page seen below from the link - https://fbref.com/en/comps/9/2023-2024/stats/2023-2024-Premier-League-Stats on Google Colab. But pd.read_html only gives me ...
rian patel
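On fbref, every table after the first is wrapped in an HTML comment, which is why pd.read_html only returns one. Unwrapping the comment markers before parsing usually recovers the rest; a sketch on stand-in HTML:

```python
from io import StringIO

import pandas as pd

# Stand-in for the fbref page source: the second table sits inside a comment
html = """
<table><tr><th>a</th></tr><tr><td>1</td></tr></table>
<!-- <table><tr><th>b</th></tr><tr><td>2</td></tr></table> -->
"""

# Strip the comment markers so the hidden table becomes visible to the parser
unwrapped = html.replace("<!--", "").replace("-->", "")
tables = pd.read_html(StringIO(unwrapped))
print(len(tables))   # 2
```

For the real page, fetch the source first (fbref may require a browser-like User-Agent) and apply the same unwrapping before calling pd.read_html.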
-1 votes
1 answer
113 views

I'm using Puppeteer and JS to write a web scraper. The site I'm scraping is pretty intense, so I need to use a local chrome instance and a residential proxy service to get it working. Here's my basic ...
Alex
  • 41
2 votes
2 answers
189 views

Using the following code: library(rvest) read_html("https://gainblers.com/mx/quinielas/progol-revancha/", encoding = "UTF-8")|> html_elements(xpath= '//*[@id="...
Alejandro Carrera
1 vote
2 answers
266 views

I am trying to extract bus prices between 2 cities in Ontario, Canada. I am using Selenium/Python to do this: The website is here and it has default cities and dates. Here is my Python code: from ...
brooklin7
1 vote
2 answers
111 views

I'm a bit new to Selenium and am trying to build a webscraper that can select a dropdown menu and then select specific options from the menu. I've built the following code and it was working at one ...
Kaitlin
  • 83
3 votes
1 answer
61 views

I'm rather new to using Beautiful Soup and I'm having some issues splitting some html correctly by only looking at html breaks and ignoring other html elements such as changes in font color etc. The ...
James Brian
1 vote
1 answer
234 views

I’ve been trying to scrape lottery results from a website that shows draws. The data is presented in a results table, but I keep running into strange issues where sometimes the numbers are captured ...
Zuryab
  • 11
2 votes
1 answer
201 views

I'm trying to extract tables from this site: https://www.dnb.com/business-directory/company-information.beverage_manufacturing.br.html As you can see, the complete table has 14,387 rows and each page ...
Alejandro Carrera
0 votes
0 answers
64 views

I'm trying to extract data from a website using Selenium. On random occasions, the page will do a client-side redirect with window.location. How can I disable this? I've tried redefining the property ...
anon
  • 697
1 vote
1 answer
260 views

I set up a self-hosted Firecrawl instance and I want to crawl my internal intranet site (e.g. https://intranet.xxx.gov.tr/). I can access the site directly both from the host machine and from inside ...
birdalugur
0 votes
1 answer
112 views

On this page I want to parse a few elements. I would like to get the text in circles and sometimes use an attribute value to click. That code doesn't return anything. With this code I want to get all attribute ...
Rok Golob
2 votes
1 answer
118 views

This is my code as of now: from selenium import webdriver from webdriver_manager.chrome import ChromeDriverManager from selenium.webdriver.chrome.service import Service options = webdriver....
Ahmad
  • 139
0 votes
1 answer
216 views

I am trying to use pytube (v15.0.0) to fetch the titles of YouTube videos. However, for every video I try, my script fails with the same error: HTTP Error 400: Bad Request. I have already updated ...
Rohit Hake
0 votes
0 answers
205 views

I have a Node scraper which scrapes the HLS streaming URL using a Playwright browser, which gives the master playlist, like: https://example.com/master.m3u8. Then that master playlist does have a CORS ...
Alsiro Mira
1 vote
2 answers
294 views

I'm trying to download a protected PDF from the New York State Courts NYSCEF website using Python. The URL looks like this: https://iapps.courts.state.ny.us/nyscef/ViewDocument?docIndex=...
Daremitsu
  • 655
4 votes
2 answers
285 views

I’m trying to programmatically download the full “pubblicazione completa non certificata” PDFs of the Italian Gazzetta Ufficiale – Serie Generale for 1969 (for an academic article). The site has a ...
Mark
  • 1,801
-2 votes
2 answers
153 views

I am using the following code. It successfully targets the correct url and node text. However, the data that is returned is incomplete as some of the fields (like previous close and open) are blank or ...
Brad Horn
  • 685
0 votes
0 answers
69 views

How can I use ScrapingRobot’s API to scrape Google search results as structured JSON data (e.g., titles, URLs, snippets) instead of raw HTML? The main page of the website shows three types of "...
AtiehCodes
0 votes
2 answers
171 views

I would like to scrape the problems from these Go (board game) books, and convert them into SGFs, if they aren't in that format already. For now, I would be satisfied with only taking the problems ...
psygo
  • 7,853
3 votes
2 answers
382 views

I'm trying to create a script in Python to scrape all available titles that show up when clicking on the black-colored area on the map in this website. For example, when I click on a certain area on ...
MITHU
  • 166
1 vote
0 answers
47 views

I'm trying to scrape a product page from Digikala using Pyppeteer because the site is heavily JavaScript-rendered. Here is my render class: import asyncio from pyppeteer import launch from pyppeteer....
Ali Motamed
