I have a web scraper that runs on my own system, and I wanted to migrate it to PythonAnywhere, but since the move it no longer works.

Specifically, send_keys does not seem to work: after the following code is executed I never move on to the next web page, so an attribute error gets tripped.

My code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import csv
import time



# Lists for functions
parcel_link = []
token = []
csv_output = []

# main scraping function
def getLinks(link):
    # Open web browser and get url - 3 second time delay.
    driver.get(link)
    time.sleep(3)

    inputElement = driver.find_element_by_id("mSearchControl_mParcelID")
    inputElement.send_keys(parcel_code+"*/n")
    print("ENTER hit")

    pageSource = driver.page_source
    bsObj = BeautifulSoup(pageSource)
    parcel_link.clear()
    print(bsObj)
    #pause = WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.ID, "mResultscontrol_mGrid_RealDataGrid")))

    for link in bsObj.find(id="mResultscontrol_mGrid_RealDataGrid").findAll('a'):
        parcel_link.append(link.text)
    print(parcel_link)

    for test in parcel_link:
        clickable = driver.find_element_by_link_text(test)
        clickable.click()
        time.sleep(2)

The page I am trying to automate is: https://ascendweb.jacksongov.org/ascend/%280yzb2gusuzb0kyvjwniv3255%29/search.aspx

and I am trying to send: 15-100*

Traceback:

03:12 ~/Tax_Scrape $ xvfb-run python3.4 Jackson_Parcel_script.py
Traceback (most recent call last):
  File "Jackson_Parcel_script.py", line 377, in <module>
    getLinks("https://ascendweb.jacksongov.org/ascend/%28biohwjq5iibvvkisd1kjmm45%29/result.aspx")
  File "Jackson_Parcel_script.py", line 29, in getLinks
    inputElement = driver.find_element_by_id("mSearchControl_mParcelID")
  File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/remote/webdriver.py", line 206, in find_element_by_id
    return self.find_element(by=By.ID, value=id_)
  File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/remote/webdriver.py", line 662, in find_element
    {'using': by, 'value': value})['value']
  File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/remote/webdriver.py", line 173, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.4/dist-packages/selenium/webdriver/remote/errorhandler.py", line 164, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: 'Unable to locate element: {"method":"id","selector":"mSearchControl_mParcelID"}' ; Stacktrace: 
    at FirefoxDriver.findElementInternal_ (file:///tmp/tmpiuuqg3m7/extensions/fxdriver@googlecode.com/components/driver_component.js:9470)
    at FirefoxDriver.findElement (file:///tmp/tmpiuuqg3m7/extensions/fxdriver@googlecode.com/components/driver_component.js:9479)
    at DelayedCommand.executeInternal_/h (file:///tmp/tmpiuuqg3m7/extensions/fxdriver@googlecode.com/components/command_processor.js:11455)
    at DelayedCommand.executeInternal_ (file:///tmp/tmpiuuqg3m7/extensions/fxdriver@googlecode.com/components/command_processor.js:11460)
    at DelayedCommand.execute/< (file:///tmp/tmpiuuqg3m7/extensions/fxdriver@googlecode.com/components/command_processor.js:11402)
03:13 ~/Tax_Scrape $ 

Selenium initialization:

for retry in range(3):
    try:
        driver = webdriver.Firefox()
        break
    except:
        time.sleep(3)

for parcel_code in token:
    getLinks("https://ascendweb.jacksongov.org/ascend/%28biohwjq5iibvvkisd1kjmm45%29/result.aspx")

PythonAnywhere uses a virtual instance of Firefox that is supposed to be headless, like PhantomJS, so I do not have a version number.

Any help would be great

RS

1 Answer

Maybe the browser used on PythonAnywhere does not load the site fast enough. Instead of time.sleep(3), try implicitly waiting for the element.

An implicit wait is to tell WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait is set for the life of the WebDriver object instance.
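
To make the polling idea concrete, here is a minimal plain-Python sketch of "retry a lookup until it succeeds or a timeout expires". It is only an illustration of the concept, not Selenium's actual implementation; the names poll_until_found, make_slow_page and ElementNotFound are made up for this example.

```python
import time


class ElementNotFound(Exception):
    """Raised when the lookup never succeeds within the timeout."""


def poll_until_found(lookup, timeout=10.0, interval=0.5):
    """Call lookup() repeatedly until it returns something other than None.

    Returns as soon as lookup() succeeds, so a fast page costs little time.
    Raises ElementNotFound once `timeout` seconds pass without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = lookup()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise ElementNotFound("no result within %.1f s" % timeout)
        time.sleep(interval)


def make_slow_page(ready_after):
    """Simulate a page whose element appears `ready_after` seconds from now."""
    ready_at = time.monotonic() + ready_after

    def lookup():
        return "element" if time.monotonic() >= ready_at else None

    return lookup


if __name__ == "__main__":
    start = time.monotonic()
    found = poll_until_found(make_slow_page(0.2), timeout=2.0, interval=0.05)
    print(found, "found after about %.2f s" % (time.monotonic() - start))
```

The timeout is an upper bound, not a fixed delay: the simulated element appears after about 0.2 seconds, so the call returns well before the 2 second limit.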

Using time.sleep() with Selenium is not a good idea in general.

Also give it more than just 3 seconds: with implicitly_wait() you specify the maximum time spent waiting for an element.
So if you set implicitly_wait(10) and the page loads in, for example, 5 seconds, Selenium will wait only 5 seconds.

driver.implicitly_wait(10)
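
As a rough sketch of how this could slot into the question's code, the helper below sets the implicit wait once and then performs the same lookup without time.sleep(3). The function name find_parcel_input and the wait_seconds parameter are assumptions made for illustration; the driver is passed in, so the snippet itself does not start a browser. It uses find_element_by_id as in the question; newer Selenium releases dropped that method in favour of find_element(By.ID, ...).

```python
def find_parcel_input(driver, url, wait_seconds=10):
    """Load `url` and return the parcel ID input box.

    `driver` is expected to be a Selenium WebDriver (for example the
    webdriver.Firefox() instance created in the question).
    """
    # Set once; the implicit wait then applies to every later
    # find_element* call made on this driver instance.
    driver.implicitly_wait(wait_seconds)
    driver.get(url)
    # Same element ID as in the question. Selenium keeps retrying this
    # lookup for up to `wait_seconds` before raising NoSuchElementException.
    return driver.find_element_by_id("mSearchControl_mParcelID")
```

With a real browser this would be called as find_parcel_input(driver, link) in place of the driver.get(link), time.sleep(3) and find_element_by_id lines at the top of getLinks.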