
I'm trying to build a simple web scraper. I want it to work by entering a value on a webpage, pressing Enter, and then scraping the new page after it loads.

So far, loading the webpage, entering the value, and pressing Enter all work, but the driver does not seem to update when the new page loads, and as such I cannot scrape the new page for information.

Does anyone know how to get this functionality to work?

Code is below:

import selenium.webdriver 
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

test_wage = float(53109.22)

options = selenium.webdriver.FirefoxOptions()
options.add_argument("--headless")
driver = selenium.webdriver.Firefox(options=options)
driver.get('https://www.thesalarycalculator.co.uk/salary.php')

takehome_form = driver.find_element(By.CLASS_NAME, "hero__input")
takehome_form.send_keys(test_wage)
takehome_form.send_keys(Keys.RETURN)

The code above works fine; it is the following line where I have an issue:

result = driver.find_element(By.XPATH, "/html/body/section[1]/div/table/tbody/tr[2]/td[6]")

Which produces the following error:

NoSuchElementException: Unable to locate element: /html/body/section[1]/div/table/tbody/tr[2]/td[6]; For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#no-such-element-exception

Again, I think this is because the original webpage does not have this information, but the new page that loads after pressing Enter on the form does. The driver does not update and still thinks the original webpage is open.

Does anyone know how to fix this?

  • Selenium will automatically wait for a click that navigates via a traditional full page load. These days, many sites use JavaScript to update the DOM instead of performing a full page load. That is what WebDriverWaits are designed for: selenium.dev/documentation/webdriver/waits Commented Aug 6, 2024 at 16:31

2 Answers


Two suggestions:

  1. Use Selenium's explicit waits to locate web elements reliably
  2. Prefer relative XPaths over absolute ones; absolute XPaths are brittle and more likely to throw NoSuchElementException when the HTML changes

Check the working code below:

import selenium.webdriver 
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

test_wage = float(53109.22)

options = selenium.webdriver.FirefoxOptions()
options.add_argument("--headless")
driver = selenium.webdriver.Firefox(options=options)
driver.get('https://www.thesalarycalculator.co.uk/salary.php')
wait = WebDriverWait(driver, 10)

takehome_form = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "hero__input")))
takehome_form.send_keys(test_wage)
takehome_form.send_keys(Keys.RETURN)

result = wait.until(EC.visibility_of_element_located((By.XPATH, "(//td[contains(@class,'takehome')])[2]")))
print(result.text)

Result:

£ 3,446.73




It might be because of the loading time of the web page. You should add an explicit wait:

WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "/html/body/section[1]/div/table/tbody/tr[2]/td[6]")))

Note that the locator must be wrapped in a single tuple, and that a bare WebDriverWait(driver, 10) does nothing by itself; the wait only happens when you call .until(...) on it. This also requires the WebDriverWait and expected_conditions imports shown in the other answer.

