I'm trying to pull some text from a webpage. The page source that I want to pull data from is:

<tbody>
    <tr class="drx_dotted">
        <td class="drx_first">
            <span name="pharmacy"
                  longitude="-82.531457"
                  latitude="42.617612"
                  pharmacyname="CVS Pharmacy #"
                  address="1025 St Clair River Dr"
                  city="Algonac"
                  state="MI"
                  zip="48001"
                  phone="8107944941">
            </span>
            <p>
                <strong>CVS Pharmacy #</strong><br />
                1025 St Clair River Dr<br />
                Algonac, MI 48001<br />
                1-810-794-4941
            </p>
            <p>
                <a class=""
                   data-ajax="true"
                   data-ajax-method="post"
                   data-ajax-success="UpdateSearchPharmacyList"
                   href="/pfdn/SharedPharmacy/AddNetworkPharmacy?pharmacyNABP=2352324&amp;language=English">Add Pharmacy
                    <span class='HiddenText'> CVS Pharmacy #</span>
                </a>
            </p>
        </td>
        <td>
            <p>
                Retail
            </p>
        </td>
        <td>
            <p>
                Not applicable
            </p>
        </td>
    </tr>

I want to pull the "Not applicable" text near the bottom of the HTML. It is the "p" inside the third "td" of the row. There are several of these rows, so I want to pull all of these tags at once into a Python list.

Here is the selenium code I'm using to find the HTML:

x = driver.find_elements_by_xpath(
    '//*[@id="divSearchResultContainer"]/div[2]/div[2]/table/tbody/tr/td[3]/p')

When I type print(x) it prints out this:

[<selenium.webdriver.remote.webelement.WebElement object at 0x101f98210>,
 <selenium.webdriver.remote.webelement.WebElement object at 0x101f98250>,
 <selenium.webdriver.remote.webelement.WebElement object at 0x101f98290>]

So Selenium has found and pulled three elements (which is correct; it was supposed to find three). However, when I try to pull the text using:

print x[0].text

the output is:

None

I've tried a bunch of variations, even trying to find each element individually, but it's still not working. Has anyone had this problem? How can I resolve it?

Thanks

  • What are you expecting the text output of print x[0].text to be? Commented Aug 5, 2014 at 19:49
  • I don't know much about selenium, but print dir(x) and see what is valid for it Commented Aug 5, 2014 at 19:52
  • Wow, sorry, I made a mistake. The output of print x[0].text is "None", but it should be "Not applicable". Thanks for pointing that out, Richard. Commented Aug 5, 2014 at 20:01

2 Answers

1

The problem is that you have multiple tr tags, so you need to pick the appropriate one. Use find_element_by_xpath() to find a single element instead of a list, and use the following xpath:

'//*[@id="divSearchResultContainer"]/div[2]/div[2]/table/tbody/tr[1]/td[3]/p'

The python code:

element = driver.find_element_by_xpath(
    '//*[@id="divSearchResultContainer"]/div[2]/div[2]/table/tbody/tr[1]/td[3]/p')

Note the [1] after the tr. This is how we are saying to look at the first tr tag only.
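
To see what the [1] changes without starting a browser, here is a minimal sketch that runs the tail of both paths against a simplified table using Python's standard xml.etree.ElementTree. The three-row table, and the "Store B", "Store C" and "Preferred" values, are made up for illustration and are not the real page. ElementTree only supports a subset of XPath, so this shows the indexing behaviour rather than what Selenium does internally.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for the results table (three rows).
TABLE = """
<table><tbody>
  <tr><td><p>CVS Pharmacy #</p></td><td><p>Retail</p></td><td><p>Not applicable</p></td></tr>
  <tr><td><p>Store B</p></td><td><p>Retail</p></td><td><p>Preferred</p></td></tr>
  <tr><td><p>Store C</p></td><td><p>Mail order</p></td><td><p>Not applicable</p></td></tr>
</tbody></table>
"""

root = ET.fromstring(TABLE.strip())

# No index on tr: the path matches the third cell's <p> in every row.
all_rows = [p.text.strip() for p in root.findall("./tbody/tr/td[3]/p")]

# tr[1]: only the first row is considered, so at most one <p> matches.
first_row = [p.text.strip() for p in root.findall("./tbody/tr[1]/td[3]/p")]

print(all_rows)
print(first_row)
```

The unindexed path returns one entry per row, while the tr[1] path returns a single entry for the first row only.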


Also note that the xpath you have looks fragile because it relies on indexing: give me the second div in this div, then the second div in that one, and so on. Posting the complete contents of the element with the divSearchResultContainer id would help to provide you with a better solution.
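
As a rough illustration of a less index-dependent path, the sketch below compares the indexed expression with one that only anchors on the container id and then searches for rows anywhere beneath it. The page skeleton is an assumption: the real markup around the table was not posted, so the wrapper divs and their contents are invented. It uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical page skeleton: the wrapper divs are assumptions, since the
# real markup around the table is not shown in the question.
PAGE = """
<body>
  <div id="divSearchResultContainer">
    <div>header</div>
    <div>
      <div>filters</div>
      <div>
        <table><tbody>
          <tr><td><p>A</p></td><td><p>Retail</p></td><td><p>Not applicable</p></td></tr>
          <tr><td><p>B</p></td><td><p>Retail</p></td><td><p>Not applicable</p></td></tr>
        </tbody></table>
      </div>
    </div>
  </div>
</body>
"""

root = ET.fromstring(PAGE.strip())

# Index-heavy path: stops matching if a wrapper div is added, removed or reordered.
indexed = ".//*[@id='divSearchResultContainer']/div[2]/div[2]/table/tbody/tr/td[3]/p"
# Looser path: any tr below the container, however deeply the table is nested.
relaxed = ".//*[@id='divSearchResultContainer']//tr/td[3]/p"

indexed_texts = [p.text for p in root.findall(indexed)]
relaxed_texts = [p.text for p in root.findall(relaxed)]

print(indexed_texts)
print(relaxed_texts)
```

On this skeleton both paths select the same cells, but only the second keeps working if the nesting of the divs changes. Whether it is specific enough for the real page depends on what else lives inside the container.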

0

Try this xpath. I haven't tested it, but XPath has a last() function, which is what you want.

"//tbody//tr//td[last()]/p[last()]/text()"
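
A small sketch of how last() selects the final cell and its final paragraph, again using the standard library's xml.etree.ElementTree on a trimmed copy of the row from the question (the markup is simplified for illustration). ElementTree has no text() step, so the text is read from the matched element instead. As far as I know the same adjustment is needed in Selenium, because its locators have to resolve to elements rather than text nodes: drop the trailing /text() and read .text from the element.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of one row from the question; attributes and links removed.
ROW = """
<tr>
  <td><p>CVS Pharmacy #</p><p>Add Pharmacy</p></td>
  <td><p>Retail</p></td>
  <td><p>Not applicable</p></td>
</tr>
"""

row = ET.fromstring(ROW.strip())

# td[last()] is the final cell of the row; p[last()] is its final paragraph.
last_cell_text = [p.text.strip() for p in row.findall("./td[last()]/p[last()]")]

print(last_cell_text)
```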
