
I am new to Selenium in Python (and to web-interface applications of Python in general), and I have a task to complete for my current internship.

My script successfully navigates to an online database and inputs information from my data tables, but then the webpage in question takes anywhere from 30 seconds to several minutes to compute an output.

How do I go about instructing Python to re-check the page every 30 seconds until the output appears so that I can parse it for the data I need? For instance, which functions might I start with?

This will be part of a loop repeated for over 200 entries, and hundreds more if I am successful, so it is worth my time to automate it.

Thanks

2 Comments
  • selenium-python.readthedocs.io/waits.html ? Commented Jul 26, 2018 at 20:13
  • Is Selenium the right tool here? If the web page in question provides an API, it might be better for you to write a script that creates multiple threads, submits your requests, and then queries for the results directly through the API (see the sketch after these comments). Just a thought. Commented Jul 26, 2018 at 20:38
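If such an API does exist, a rough sketch of that approach might look like the following (the endpoint URL, payload, and thread count are all assumptions, not details from the question):

import concurrent.futures
import requests

def submit_and_fetch(entry):
    # Hypothetical endpoint and payload; replace with the database's real API.
    resp = requests.post("https://example.com/api/compute", json=entry, timeout=120)
    resp.raise_for_status()
    return resp.json()

entries = [{"id": 1}, {"id": 2}]  # placeholder rows standing in for your data tables
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(submit_and_fetch, entries))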

2 Answers


You should use Selenium's waits, as pointed out by G_M and Sam Holloway. The one I use most is expected_conditions:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
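    # Wait up to 10 seconds for the element to appear before moving on.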
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()

It will wait until there is an element with the id "myDynamicElement" and then continue with the rest of the try block, which should contain the rest of your work. I prefer to locate elements with By.XPATH, but if you use By.XPATH with presence_of_element_located, add another pair of parentheses so the locator is passed as the required tuple, as noted in this answer:

from selenium.webdriver.common.by import By

driver.find_element(By.XPATH, '//button[contains(text(),"Some text")]')
driver.find_element(By.XPATH, '//div[@id="id1"]')
driver.find_elements(By.XPATH, '//a')

For me, the easiest way to find the XPath of an element is to open Chrome's developer tools (F12), press Ctrl+F in the Elements panel, and inspect the element with the mouse while composing an XPath that is specific enough to match only the expected element, or as few elements as possible.

All the examples are from (or based on) the great Selenium documentation.
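Since the computation in the question can take anywhere from 30 seconds to several minutes, one possible adaptation (just a sketch: the 10-minute timeout, the 30-second poll interval, and the element id are assumptions) is to raise the timeout and let the wait poll every 30 seconds:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Check the page every 30 seconds, for up to 10 minutes in total.
wait = WebDriverWait(driver, timeout=600, poll_frequency=30)
output = wait.until(
    EC.presence_of_element_located((By.ID, "myDynamicElement"))
)
print(output.text)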


2 Comments

Thanks, before reading this post I did in fact try Selenium Waits and it works very well, although my code is probably not as robust as this and finds an element by ID instead of XPATH.
The code is from the official documentation; the second block is a small adaptation. It is good to use try, except, and finally when working on an unstable task such as web scraping. Also, you should accept the best answer to your question.

If you just want to space out checks, the time.sleep() function should work.

However, as G_M's comment says, you should look into Selenium waits. Think about this: is there an element on the page that will indicate that the result is loaded? If so, use a Selenium wait on that element to make sure your program is only pausing until the result is loaded and not wasting any time afterwards.
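For example, a manual polling loop might look something like this (a rough sketch: the element id "result", the 30-second interval, and the 10-minute cap are assumptions, and driver is an already-initialised WebDriver sitting on the results page):

import time
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

# Re-check the page every 30 seconds, for up to 20 attempts (about 10 minutes).
for _ in range(20):
    try:
        result = driver.find_element(By.ID, "result")  # hypothetical output element
        break
    except NoSuchElementException:
        time.sleep(30)
        # driver.refresh()  # uncomment if the page must be reloaded to show new output
else:
    raise TimeoutError("The output never appeared")

print(result.text)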

1 Comment

Terrific, I will look into those functions and see if I can get them to work
