Note: I spent more than an hour trying to solve this issue and found no existing solution that worked for me.

In the end it turned out to be a very simple mistake, but I am posting the question so that anybody else with the same issue can find the solution quickly.


Problem

I was trying to scrape a site with the following code:

phantomjs_path = '/Users/xxx/xxx/phantomjs-2.1.1-macosx/bin/phantomjs'

driver = webdriver.PhantomJS(executable_path=phantomjs_path)

driver.set_window_size(1024, 768) #optional

driver.get(url)

# wait
element = WebDriverWait(driver, 20).until(
    EC.presence_of_element_located((By.CLASS_NAME, "flightrow")))

response = driver.find_element_by_css_selector('table[class="flighttable"]')

driver.quit()

html = response.get_attribute('outerHTML') #convert the WebElement to an HTML string

I was getting the following error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1254, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1151, in _send_request
    self.endheaders(body)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 934, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 877, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 849, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/socket.py", line 711, in create_connection
    raise err
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/socket.py", line 702, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "my_script.py", line 1251, in <module>
    MyObject.script_main()
  File "my_script.py", line 1232, in script_main
    self.parse_js(url)
  File "my_script.py", line 1202, in parse_js
    print('response:', response.text)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/remote/webelement.py", line 68, in text
    return self._execute(Command.GET_ELEMENT_TEXT)['value']
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/remote/webelement.py", line 461, in _execute
    return self._parent.execute(command, params)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 234, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/remote/remote_connection.py", line 401, in execute
    return self._request(command_info[0], url, body=data)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/remote/remote_connection.py", line 471, in _request
    resp = opener.open(request, timeout=self._timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1282, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1256, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 61] Connection refused>

Loading the URL manually in Chrome worked.

I also tried switching the URL from https to http, but I still got the same error.

In addition, I had not gotten any error the previous day, so I assumed it could not be a firewall problem, which is what some other questions suggested.

See the answer for the solution...

1 Answer

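It turned out that I had accidentally moved the line driver.quit() upwards, so the error was raised by the later call to get_attribute (or any other call on the element, such as response.text in the traceback). A WebElement is only a handle: every call on it is sent to the running driver, so once driver.quit() has shut the driver down, the connection is refused.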

Solution

Just move driver.quit() downwards:

driver = webdriver.PhantomJS(executable_path=phantomjs_path)

driver.set_window_size(1024, 768) #optional

driver.get(url)

# wait
element = WebDriverWait(driver, 20).until(
    EC.presence_of_element_located((By.CLASS_NAME, "flightrow")))

response = driver.find_element_by_css_selector('table[class="flighttable"]')

html = response.get_attribute('outerHTML') #convert the WebElement to an HTML string

#do not move quit() upwards! even though get_attribute is not called on 'driver' directly,
#it raises an error once the driver has been closed.
driver.quit()
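
One way to make this ordering hard to break by accident is to do all the browser work inside a block that quits the driver only on exit. The following is a minimal sketch, not part of my original script: the names quitting and table_html are hypothetical helpers, and the wait step from above is left out for brevity (it would go right after d.get(url)). It uses the same find_element_by_css_selector call as the code above; newer Selenium releases spell this find_element(By.CSS_SELECTOR, ...).

```python
from contextlib import contextmanager


@contextmanager
def quitting(driver):
    """Yield the driver and always call driver.quit() when the block exits."""
    try:
        yield driver
    finally:
        # Runs on normal exit, on return, and on exceptions.
        driver.quit()


def table_html(driver, url):
    """Return the table's outerHTML as a plain string, then quit the driver."""
    with quitting(driver) as d:
        d.get(url)
        element = d.find_element_by_css_selector('table[class="flighttable"]')
        # The string is extracted while the driver is still running;
        # quit() only happens after this value has been computed.
        return element.get_attribute('outerHTML')
```

Called as html = table_html(webdriver.PhantomJS(executable_path=phantomjs_path), url), this returns an ordinary string that is safe to use after the driver has gone away, and the driver is also shut down if any step raises.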