Traceback Error while using selenium with python beautifulsoup library

Question

I'm using this code to scrape some data from https://website.grader.com/results/www.dubizzle.com. Because the tags I want to extract only appear about 15 seconds after the page starts loading, someone recommended Selenium so that I could introduce a delay in the code.

The code is below:

#!/usr/bin/python
import urllib
import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup
from dateutil.parser import parse
from datetime import timedelta
import MySQLdb
import re
import pdb
import sys
import string



driver = webdriver.Firefox()
driver.get('https://website.grader.com/results/dubizzle.com')
time.sleep(25)
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')


# print soup

Sizeofweb=""
try:

    # find() returns a Tag; get_text() returns its text as a unicode string
    Sizeofweb = soup.find('span', {'data-reactid': ".0.0.3.0.0.3.$0.1.1.0"}).get_text()
    print Sizeofweb.encode("utf-8")

except StandardError as e:
    converted_date="Error was {0}".format(e)
    print converted_date

The part of the HTML that I am trying to extract is below:

Snap: https://www.dropbox.com/s/7dwbaiyizwa36m6/5.PNG?dl=0

<div class="result-value" data-reactid=".0.0.3.0.0.3.$0.1.1">
<span data-reactid=".0.0.3.0.0.3.$0.1.1.0">1.1</span>
<span class="result-value-unit" data-reactid=".0.0.3.0.0.3.$0.1.1.1">MB</span>
</div>

I installed geckodriver by downloading it and extracting it to the /home directory, and then added it to my path with export PATH=$PATH:/home/geckodriver, as @Ahn Smith recommended elsewhere.

Now when I run the program, it gives this error:

Traceback (most recent call last):
  File "ahmed.py", line 17, in <module>
    driver = webdriver.Firefox()
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/webdriver.py", line 140, in __init__
    self.service.start()
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/common/service.py", line 74, in start
    stdout=self.log_file, stderr=self.log_file)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 20] Not a directory

user2096803 · Accepted Answer · 2016-11-29 13:41:28Z

1

There are two ways to point Selenium to the appropriate webdriver. You can pass it as a parameter:

driver = webdriver.Firefox(executable_path='/path/to/geckodriver')

Or you can add the directory that contains geckodriver to your shell's PATH environment variable:

$ export PATH=$PATH:/path/to/

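I think your problem is that you're adding the geckodriver file itself to PATH rather than the folder containing it. Every PATH entry is treated as a directory, so with /home/geckodriver on PATH the lookup tries /home/geckodriver/geckodriver, which would explain the "OSError: [Errno 20] Not a directory". If the binary is at /home/geckodriver, the entry to add is /home.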
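
One way to check for this kind of mistake is to look for PATH entries that exist but are not directories. This is only a rough sketch: non_directory_path_entries is a hypothetical helper name, not something Selenium provides.

```python
import os


def non_directory_path_entries(path_value):
    """Return the PATH entries that exist on disk but are not directories.

    Hypothetical helper for illustration; path_value is a PATH-style string
    such as os.environ["PATH"].
    """
    # Skip empty entries produced by leading, trailing, or doubled separators.
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return [
        entry for entry in entries
        if os.path.exists(entry) and not os.path.isdir(entry)
    ]
```

Calling non_directory_path_entries(os.environ.get("PATH", "")) should list any entry that points at a file. Each such entry would need to be replaced by the directory that contains the file.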


4 Comments

info Over a year ago

Should /path/to/ be left as it is? The geckodriver is in the home directory, so should I put driver = webdriver.Firefox(executable_path='/path/to/geckodriver')?

user2096803 Over a year ago

If you pass it as the executable_path parameter, give the full path, including the geckodriver file name. If you add it to the PATH variable in your shell, include only the path to the directory containing geckodriver.
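
The difference between the two values can be sketched like this. The helper name driver_locations is hypothetical, and /home/geckodriver is assumed from the question as the place where the binary was extracted.

```python
import os


def driver_locations(binary_path):
    """Split a driver binary path into the two values discussed above.

    Returns (executable_path, path_directory):
    - executable_path: the full path, including the file name
    - path_directory: only the containing directory, suitable for PATH
    """
    executable_path = os.path.abspath(binary_path)
    path_directory = os.path.dirname(executable_path)
    return executable_path, path_directory


# Assumed location (from the question): geckodriver extracted to /home.
exe, directory = driver_locations("/home/geckodriver")
print("executable_path=" + exe)
print("export PATH=$PATH:" + directory)
```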

info Over a year ago

Please check my answer for the code and the error. I have put the geckodriver in the home directory.

user2096803 Over a year ago

Try a reboot. If that fails, try checking your permissions to make sure the geckodriver is executable. (Or try copying it to /usr/local/sbin/)
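
The permission check can also be done from Python before starting the driver. This is a minimal sketch assuming a Linux system; is_executable_file is a hypothetical helper name.

```python
import os


def is_executable_file(path):
    """Return True if path is a regular file the current user may execute."""
    # os.access with os.X_OK asks the OS whether execution is permitted.
    return os.path.isfile(path) and os.access(path, os.X_OK)
```

If is_executable_file('/home/geckodriver') returns False even though the file exists, running chmod +x on it should set the executable bit.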
