
I am using Selenium for Python to scrape a site with multiple pages. To get to the next page, I use driver.find_element(By.XPATH, xpath). However, the XPath text changes from page to page, so I want to locate the element by other attributes instead.

I tried to find the element by class name, using "page-link": driver.find_element(By.CLASS_NAME, "page-link"). However, the "page-link" class is also present in the disabled list item. As a result, the Selenium driver won't stop after the last page, in this case page 2.

I want to stop the driver from clicking the disabled item on the page, i.e. I want it to ignore the last item in the list: the one with class "page-item disabled", aria-disabled="true" and aria-hidden="true". The idea is that if the script can't find that item, it will end a while loop that relies on the ">" button being enabled.

See the source code below.

Please advise.

<nav>
<ul class="pagination">
<li class="page-item">
    <a class="page-link" href="https://www.blucap.net/app/FlightsReport?fromdate=2023-02-01&amp;todate=2023-02-28&amp;filterByMemberId=&amp;view=View%20Report&amp;page=1" rel="prev" aria-label="&laquo; Previous">&lsaquo;</a>
</li>
<li class="page-item">
    <a class="page-link" href="https://www.blucap.net/app/FlightsReport?fromdate=2023-02-01&amp;todate=2023-02-28&amp;filterByMemberId=&amp;view=View%20Report&amp;page=1">1</a>
</li>
<li class="page-item active" aria-current="page">
    <span class="page-link">2</span>
</li>
<li class="page-item disabled" aria-disabled="true" aria-label="Next &raquo;">
    <span class="page-link" aria-hidden="true">&rsaquo;</span>
</li>
</ul>
</nav>
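One way to frame the stopping condition from the markup above: when the last page is reached, the "next" control is rendered as a <span> rather than an <a>, so checking for an anchor with rel="next" distinguishes the two states. A minimal standard-library illustration (no Selenium, and assuming the enabled "next" control carries rel="next", mirroring the rel="prev" anchor shown in the snippet):

```python
# Sketch, standard library only. Assumption: the enabled "next" control is
# an <a> with rel="next" (by analogy with the rel="prev" anchor shown);
# on the last page it becomes a <span>, so no such anchor exists.
from html.parser import HTMLParser

class NextLinkFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") == "next":
            self.next_href = attrs.get("href")

def find_next_href(page_source):
    """Return the href of the enabled 'next' link, or None on the last page."""
    finder = NextLinkFinder()
    finder.feed(page_source)
    return finder.next_href

# The disabled "next" item from the posted source: no <a rel="next"> inside.
last_page_html = """
<li class="page-item disabled" aria-disabled="true" aria-label="Next &raquo;">
    <span class="page-link" aria-hidden="true">&rsaquo;</span>
</li>
"""
print(find_next_href(last_page_html))  # None -> end the while loop
```

In a Selenium loop the same idea can be applied to driver.page_source, or directly with driver.find_elements (plural), which returns an empty list instead of raising when nothing matches.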

2 Answers


To go to the Next Page there can be a couple of approaches:

  • You can use find_element() to click the descendant <span> of the <li> whose aria-label starts with "Next" but which does not contain aria-disabled="true", as follows:

    driver.find_element(By.XPATH, "//li[starts-with(@aria-label, 'Next') and not(@aria-disabled='true')]/span").click()
    



Alas, the solution offered did not work.

In the end I decided to rely on the HTTP links instead: using a regex, I extract the page numbers larger than 1 (the first page being the starting page). Works like a charm.
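The approach described above can be sketched as follows; the regex and the helper name are illustrative, not from the original post. The page numbers are pulled from the pagination hrefs, de-duplicated, and everything past page 1 is kept:

```python
# Sketch of the href/regex approach (helper name pages_after_first is
# illustrative). Extract distinct page numbers > 1 from pagination URLs.
import re

def pages_after_first(hrefs):
    """Return sorted page numbers greater than 1 found in the given URLs."""
    pages = set()
    for href in hrefs:
        m = re.search(r"[?&]page=(\d+)", href)
        if m and int(m.group(1)) > 1:
            pages.add(int(m.group(1)))
    return sorted(pages)

hrefs = [
    "https://www.blucap.net/app/FlightsReport?fromdate=2023-02-01&page=1",
    "https://www.blucap.net/app/FlightsReport?fromdate=2023-02-01&page=2",
]
print(pages_after_first(hrefs))  # [2]
```

With Selenium, the hrefs can be collected via something like [a.get_attribute("href") for a in driver.find_elements(By.CSS_SELECTOR, "ul.pagination a.page-link")], then each remaining page visited with driver.get().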

Thanks for any effort, much appreciated.


