I'm scraping the website of a university's enrollment system. Each page links to many other pages, and each link has an ID of the form SEC_SHORT_TITLE_x, where x is an integer from 1 to 20. On each linked page I'd like to scrape a few pieces of data; for now I'm only trying to scrape the section name. I'll handle the logic for going back a page and clicking the next link after this.

[Screenshot: DevTools showing the XPath]

for y in range(1):
    for j in range(1,2):
        if browser.find_elements_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "']"):
            #outputstring = ''
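            # find_elements_* returns a list, which has no click(); use the singular find_element_*
            browser.find_element_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "']").click()
            time.sleep(10)
            section = browser.find_element_by_xpath("//p[@id='VAR2']")
            # print the element's text rather than the WebElement object
            print(section.text)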

The script navigates to the proper page that contains all the links but isn't able to click on the first link as it should.

[7756:2296:0923/141749.015:ERROR:ssl_client_socket_impl.cc(941)] handshake failed; returned -1, SSL error code 1, net_error -100

2 Comments

  • Does this discussion help you? Commented Sep 24, 2019 at 11:52
  • Yes, that explained the problem. Thank you @DebanjanB. Commented Sep 24, 2019 at 18:47

1 Answer

Based on the error message you provided (SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//a[@id='SEC_SHORT_TITLE_1]' is not a valid XPath expression. (Session info: chrome=77.0.3865.90)), it looks like your XPath syntax is incorrect. You need to add a closing ' mark inside the square brackets.

Change //a[@id='SEC_SHORT_TITLE_1]

To //a[@id='SEC_SHORT_TITLE_1']

Notice how I added a single ' mark after 'SEC_SHORT_TITLE_1'.

Based on your code sample, you'll need to update this line by changing:

browser.find_elements_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "]"):

to:

browser.find_elements_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "']"):

I've added a single ' mark before your closing square bracket to correct the XPath syntax.
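One way to make this kind of quoting mistake harder to reintroduce is to build the XPath in a single place. The sketch below is only an illustration: the helper name section_link_xpath is hypothetical (it is not in your code), and the 1 to 20 range is taken from your description of the link IDs.

```python
def section_link_xpath(index):
    """Build the XPath for the link whose id is SEC_SHORT_TITLE_<index>.

    Hypothetical helper, not part of the original script.
    """
    # The f-string keeps both single quotes next to the id value,
    # so the closing quote cannot be dropped during concatenation.
    return f"//a[@id='SEC_SHORT_TITLE_{index}']"


# The question states that the ids run from 1 to 20.
for j in range(1, 21):
    xpath = section_link_xpath(j)
    # Cheap sanity check: exactly one opening and one closing quote.
    assert xpath.count("'") == 2

print(section_link_xpath(1))  # //a[@id='SEC_SHORT_TITLE_1']
```

The returned string can then be passed to browser.find_elements_by_xpath(...) in place of the concatenated expression.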


3 Comments

Apologies, I must have posted an older error message, as I had already fixed that issue. I'll rerun and post the new error messages. Thanks for the response.
No problem, let me know what you find. Happy to help.
@Sartorialist This might have something to do with the website requiring SSL. You can try to bypass this by adding --ignore-certificate-errors and --ignore-ssl-errors to ChromeOptions() when you initialize the WebDriver. More information can be found here stackoverflow.com/questions/37883759/…
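
For reference, passing those two flags might look roughly like the sketch below. It assumes Chrome with a matching chromedriver on the PATH and a Selenium version that accepts the options keyword; the names SSL_FLAGS and build_browser are hypothetical, and whether the flags actually silence the handshake message for this site is untested.

```python
# Flags taken from the comment above.
SSL_FLAGS = ["--ignore-certificate-errors", "--ignore-ssl-errors"]


def build_browser():
    """Create a Chrome WebDriver that ignores certificate errors.

    Hypothetical helper; assumes chromedriver is installed and on the PATH.
    """
    # Imported here so the flag list can be inspected without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for flag in SSL_FLAGS:
        options.add_argument(flag)
    return webdriver.Chrome(options=options)
```

Calling browser = build_browser() would then replace the existing WebDriver initialization in the script.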
