I have a list of URLs in a .txt file that I would like to run using selenium.

Let's say the file is named b.txt and contains two URLs, formatted exactly as follows: https://www.google.com/,https://www.bing.com/,

I am trying to make Selenium open both URLs from the .txt file, but the code fails every time it reaches the "driver.get" line.

url = open ('b.txt','r')
url_rpt = url.read().split(",")
options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome(chrome_options=options)
for link in url_rpt:
   driver.get(link)
driver.quit()

The result that I get when I run the code is

Traceback (most recent call last):
  File "C:/Users/ASUS/PycharmProjects/XXXX/Test.py", line 22, in <module>
    driver.get(link)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: headless chrome=79.0.3945.117)

Any suggestions on how to rewrite the code?

  • What do you mean by "fails"? Are you getting an exception? If so, what is the message and stacktrace? We need this basic info. Commented Jan 15, 2020 at 16:24
  • In the for loop, add a print(link) line above driver.get(link). Commented Jan 15, 2020 at 16:24
  • When "the code fails" what do you mean? What is the error message? What happens if you just run for url in url_rpt: print(url). This might not be an issue with Selenium, but possibly with the url input and reading strategy. It would help to narrow down whether or not Selenium is truly throwing the error, or if the issue is with the file. Commented Jan 15, 2020 at 16:25
  • I'll update this on the post. Commented Jan 15, 2020 at 16:40
  • @Christine: Thanks! If I run for url in url_rpt: print(url), it returns both links just fine. Commented Jan 15, 2020 at 16:48

4 Answers 4

18

This error message...

Traceback (most recent call last):
  .
    driver.get(link)
  .
    self.execute(Command.GET, {'url': url})
  .
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: chrome=79.0.3945.117)

...implies that the URL passed as an argument to get() was invalid.

I was able to reproduce the same traceback when the text file containing the list of URLs had a space character after the separator that follows the last URL. Possibly a space character was present at the very end of b.txt, after the trailing comma in https://www.google.com/,https://www.bing.com/,


Debugging

A good debugging approach is to print url_rpt, which would have revealed the space character as follows:

  • Code Block:

    url = open ('url_list.txt','r')
    url_rpt = url.read().split(",")
    print(url_rpt)
    
  • Console Output:

    ['https://www.google.com/', 'https://www.bing.com/', ' ']
    

Solution

If you remove the space character from the end of the file, your own code will run without errors:

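from selenium import webdriver

# Configure Chrome: start maximized and hide the automation switches.
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# executable_path is the Selenium 3 style of pointing at chromedriver.
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')

# Read the comma-separated URLs and print the list to inspect each entry.
with open('url_list.txt', 'r') as url_file:
    url_rpt = url_file.read().split(",")
print(url_rpt)

# Visit each URL in turn, then close the browser.
for link in url_rpt:
    driver.get(link)
driver.quit()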
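
Rather than editing the file by hand, the list can also be cleaned in code. A minimal sketch, assuming a hypothetical helper named clean_urls (not part of the original code) that strips surrounding whitespace from every entry and drops the empty ones left behind by a trailing separator:

```python
def clean_urls(raw_text, separator=","):
    """Split raw_text on separator, strip whitespace, and drop empty entries."""
    return [part.strip() for part in raw_text.split(separator) if part.strip()]


# Sample content mimicking b.txt: trailing comma, a space, and a newline.
sample = "https://www.google.com/,https://www.bing.com/, \n"
print(clean_urls(sample))
# → ['https://www.google.com/', 'https://www.bing.com/']
```

Only the cleaned entries would then be passed to driver.get(), so a stray space or empty string never reaches the browser.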

4 Comments

Realized that there is a comma at the end of the list! Thanks a lot for highlighting this!!
I encountered the same error when I forgot to start the URL with https://
Like @philomath, I was getting that exception from driver.get() and solved it by adding http:// as a prefix (http://localhost in my case).
I was adding a list using a multi line string inside a function, calling .splitlines() on it, and it was counting the indentation as a new array element with four spaces. Thank you!
2

I faced a similar issue, where Selenium errored out while opening the URL and printed the message below:

selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: MicrosoftEdge=91.0.852.0)

On closer inspection, I found that my URL string was UTF-8 text containing a leading ZWNBSP character (U+FEFF), which Selenium could not accept as part of the URL (I was reading a list of URLs from a file, which caused this). IMO, Selenium should have reported the error more clearly (saying that the URL argument was invalid).

To fix the issue, I used the code below to clean my URL:

url = url.encode('ascii', 'ignore').decode('unicode_escape')
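
Note that encoding to ASCII with 'ignore' drops every non-ASCII character, not just the ZWNBSP. As an alternative sketch, assuming the character comes from a UTF-8 byte order mark at the start of the file, Python's built-in "utf-8-sig" codec removes it while reading. The helper name read_urls and the temporary demo file are illustrative assumptions, not part of the original answer:

```python
import os
import tempfile


def read_urls(path):
    """Read one URL per line, decoding with utf-8-sig so a leading BOM is dropped."""
    with open(path, encoding="utf-8-sig") as handle:
        return [line.strip() for line in handle if line.strip()]


# Demo: write a temporary file that starts with a UTF-8 BOM (bytes EF BB BF).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbfhttps://www.google.com/\nhttps://www.bing.com/\n")
    demo_path = tmp.name

urls = read_urls(demo_path)
os.remove(demo_path)

# ascii() makes any leftover invisible character show up as an escape.
print([ascii(u) for u in urls])
```

Printing with ascii() or repr() is also a quick way to spot such characters, since a plain print() does not show them.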

1 Comment

FYI, these extra characters (e.g. ZWNBSP) might not be visible if we just print the URLs to check.
0

This worked for me:

link = 'https://' + link

driver.get(link)
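
Prepending the scheme unconditionally would break links that already have one. A small sketch, assuming a hypothetical helper named ensure_scheme (not from the answer above) that adds the prefix only when it is missing, using the standard library's urllib.parse:

```python
from urllib.parse import urlparse


def ensure_scheme(link, default_scheme="https"):
    """Prepend default_scheme:// only when link lacks an http, https, or file scheme."""
    link = link.strip()
    if urlparse(link).scheme in ("http", "https", "file"):
        return link
    return f"{default_scheme}://{link}"


print(ensure_scheme("www.google.com"))         # → https://www.google.com
print(ensure_scheme("https://www.bing.com/"))  # → https://www.bing.com/
print(ensure_scheme("localhost:8080", "http")) # → http://localhost:8080
```

The result can then be passed to driver.get() as before.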


0

If your link is a local HTML file, try this:

link = 'file://' + link
driver.get(link)
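
For local files the URL scheme is file://, and building the URL by string concatenation is easy to get wrong with relative or Windows paths. A sketch using pathlib, where the helper name local_file_url and the file name page.html are illustrative assumptions:

```python
from pathlib import Path


def local_file_url(path):
    """Turn a local file path into an absolute file:// URL via pathlib."""
    return Path(path).resolve().as_uri()


# page.html is a hypothetical file name; the file does not need to exist here.
print(local_file_url("page.html"))
```

The returned string starts with file:/// followed by the absolute path and can be passed to driver.get().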
