2

I'm trying to grab some data using Selenium and xpaths.

The following xpath works fine:

print sel.get_attribute("xpath=(//*[@id='course_list']/*[@class='series'])[4]//*[@class='series_links']//a/@href")

and returns 4 matching URLs. So far so good.

The issue is that I want to write an xpath to target each URL individually.

Using the Firefox xpath checker plugin, I have managed to confirm that the following code does exactly what I need:

((//*[@id='course_list']/*[@class='series'])[4]//*[@class='series_links']//a/@href)[1]

But despite working in the Firefox xpath checker, I can't seem to get this to work in Selenium.

Whenever I try to execute:

print sel.get_attribute("xpath=((//*[@id='course_list']/*[@class='series'])[4]//*[@class='series_links']//a/@href)[1]")

I get the following error:

Exception: ERROR: Invalid xpath [2]: ((//*[@id='course_list']/*[@class='series'])[4]//*[@class='series_links']//a

Not sure what's going on here. Am I making a simple mistake, or do Selenium XPaths not support nested brackets the way the Firefox xpath checker does?

Any thoughts would be most appreciated, as I've been working on this for hours and can't seem to make it work :(

3
  • Ah :( any thoughts on how I might solve my problem by another means? If I run print sel.get_attribute("xpath=(//*[@id='course_list']/*[@class='series'])[4]//*[@class='series_links']//a/@href") it only prints one result, despite that xpath matching 4 different URLs. Commented Sep 16, 2012 at 13:43
  • @MartijnPieters, no, this is a syntactically and semantically correct XPath 1.0 expression. Commented Sep 16, 2012 at 16:05
  • @DimitreNovatchev: Okay, wrong guess then. Commented Sep 16, 2012 at 17:15

2 Answers

1

This is, again, not an answer to your question, but I never use xpaths like this. If the page's author was careful enough to use classes, they are also likely to change the structure of the page while keeping those classes, so selecting by class name tends to be less brittle than a long positional xpath.

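from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
# Load the page with driver.get(...) before looking anything up.

# find_elements returns every match; index 3 is the fourth "series" block.
series = driver.find_elements(By.CLASS_NAME, "series")[3]
# The href sits on the <a> tags inside each "series_links" container.
series_links = [a.get_attribute('href') for a in series.find_elements(By.CSS_SELECTOR, ".series_links a")]

driver.quit()  # call this when you're done using the webdriver.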

2 Comments

Thanks for the reply! Do I need to be using webdriver for this to work? At the moment I think I'm just using RC ("from selenium import selenium"). When I try to run your code I get: "NameError: global name 'driver' is not defined"
Yes, this is a webdriver function. I'll edit the post to include the imports and stuff, to be fully functional.
0

Not really an answer to my question, but I did find a workaround for those who might come across a similar problem.

Selenium's get_xpath_count command allows for relatively painless xpath validation. If you specify an incorrect xpath (or one that doesn't exist), the command will simply return a zero ('0').

So I'm now using a simple 'if' statement to verify an xpath exists before running the get_attribute command:

# The RC client may hand the count back as a string, so convert it before comparing.
if int(sel.get_xpath_count("(//*[@class='series_links'])[" + str(data) + "]//*[@class='youtube']")) > 0:
    print sel.get_attribute("xpath=(//*[@id='course_list']/*[@class='series'])[" + str(data) + "]//*[@class='youtube']//a/@href")
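
For anyone who still wants to target each URL individually in RC, here is a possible approach. As far as I can tell, RC's get_attribute does not evaluate the whole string as XPath. It expects an attribute locator of the form locator@attribute and splits on the last "@". That would explain why the error message shows the xpath cut off just before @href. The sketch below assumes that behaviour. The helper names attribute_locator and split_attribute_locator are made up for illustration and are not part of Selenium. It puts the index on the element and the attribute name at the very end.

```python
def attribute_locator(element_xpath, attribute, index=None):
    """Build an RC-style attribute locator: 'xpath=<element>@<attribute>'.

    Hypothetical helper, not part of the Selenium API.
    """
    if index is not None:
        # Parenthesise so the 1-based index applies to the whole node-set.
        element_xpath = "(%s)[%d]" % (element_xpath, index)
    return "xpath=%s@%s" % (element_xpath, attribute)


def split_attribute_locator(locator):
    """Mimic the assumed RC behaviour of splitting on the last '@'."""
    element, _, attribute = locator.rpartition("@")
    return element, attribute


# Element part of the xpath from the question, without the trailing /@href.
LINKS = "(//*[@id='course_list']/*[@class='series'])[4]//*[@class='series_links']//a"

# One locator per matching link (the question reports 4 matches).
link_locators = [attribute_locator(LINKS, "href", index=i) for i in range(1, 5)]
first_link = link_locators[0]

# What the failing locator from the question splits into under this assumption.
broken = split_attribute_locator("xpath=(" + LINKS + "/@href)[1]")
```

With a live RC session, each entry would be passed as sel.get_attribute(first_link). The value of broken shows the problem with the original form: the "attribute name" ends up as href)[1] and the element xpath is left with an unclosed bracket.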
