6

I am trying to build a proxy scraper for a specific site, but I can't get it to move to the next page.

This is the code that I'm using.

If you answer, please explain a bit about what you used, and if you know of any good tutorials covering this kind of code, please share them:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time

options = Options()
#options.headless = True     #for headless
#options.add_argument('--disable-gpu') #for headless and os win

driver = webdriver.Chrome(options=options)

driver.get("https://hidemyna.me/en/proxy-list/")
time.sleep(10) #bypass cloudflare


tbody = driver.find_element_by_tag_name("tbody")
cell = tbody.find_elements_by_tag_name("tr")

for column in cell:
    column = column.text.split(" ")
    print (column[0]+":"+ column[1]) #ip and port

nxt = driver.find_element_by_class_name('arrow_right')
nxt.click()
  • Try nxt = driver.find_element_by_css_selector('.arrow__right>a'). Note that there are two underscores in the class name. Commented Dec 28, 2018 at 9:12
  • Post the relevant HTML here. Commented Dec 28, 2018 at 17:09

3 Answers

5

To move to the next page you can try the following solution:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import TimeoutException, WebDriverException
    
    NEXT_LINK = "//li[@class='arrow__right']/a"  # note the double underscore in the class name
    
    options = Options()
    options.add_argument("start-maximized")
    options.add_argument("disable-infobars")
    options.add_argument("--disable-extensions")
    # Selenium 3 call style; Selenium 4 passes the driver path through a Service object instead
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get('https://hidemyna.me/en/proxy-list/')
    while True:
        try:
            # Wait up to 20 s for the "next" link to become clickable, then scroll it into view
            nxt = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, NEXT_LINK)))
            driver.execute_script("arguments[0].scrollIntoView(true);", nxt)
            nxt.click()
            print("Navigating to Next Page")
        except (TimeoutException, WebDriverException):
            # No clickable "next" link within the timeout (or the click failed): stop paging
            print("Last page reached")
            break
    driver.quit()
    
  • Console Output:

    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    Navigating to Next Page
    .
    .
    .
    Navigating to Next Page
    Last page reached
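
    If an overlay intercepts the click (as in the "is not clickable at point" error quoted in the comments), one possible workaround is to read the link's href and navigate to it directly instead of clicking. The following is only a sketch under assumptions: walk_pages and max_pages are hypothetical names, the "next" link is assumed to carry a usable href, and the locator strategy is written as the plain string "css selector" (the value of Selenium's By.CSS_SELECTOR) so the function works with any driver-like object. In a real script you would normally import By and pass By.CSS_SELECTOR.

```python
NEXT_LINK_CSS = ".arrow__right > a"  # double underscore in the class name
CSS_SELECTOR = "css selector"        # same string value as selenium's By.CSS_SELECTOR


def walk_pages(driver, max_pages=50):
    """Follow the "next" link's href from page to page.

    `driver` is assumed to behave like a Selenium WebDriver: it needs
    find_elements(by, value) and get(url). Returns the URLs visited
    after the starting page.
    """
    visited = []
    for _ in range(max_pages):
        links = driver.find_elements(CSS_SELECTOR, NEXT_LINK_CSS)
        if not links:
            break  # no "next" link found: treat this as the last page
        href = links[0].get_attribute("href")
        if not href or href in visited:
            break  # missing href or a loop back to a page already seen
        driver.get(href)  # navigate directly, so nothing can intercept a click
        visited.append(href)
    return visited
```

    Scraping the table rows would go inside the loop, before the call to driver.get. Whether the site tolerates direct navigation as well as clicks is something to verify against the live page.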
    

4 Comments

@CodeIt Can you explain why this will not work on page 2?
@DebanjanB, on the second page your selector will match the Previous button instead of Next.
I tried the above, but it is not working on this site. The error output is: selenium.common.exceptions.WebDriverException: Message: unknown error: Element <a href="/en/proxy-list/?start=64#list"></a> is not clickable at point (874, 563). Other element would receive the click: <jdiv class="hoverl_6R"></jdiv>
@Haunter Check out my updated answer and let me know the status.
0

You are not actually clicking the anchor <a> tag. To navigate to the next page, you need to click the <a> link itself.

You can use find_element_by_xpath like below.

driver.find_element_by_xpath('//*[@id="content-section"]/section[1]/div/div[4]/ul/li[1]/a').click()

Instead of XPath, you may use a CSS selector, as suggested by @Andersson.
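
A positional step such as li[1] depends on how many list items come before the "next" arrow, so it can point at a different button once a "previous" arrow appears. The sketch below illustrates this with the standard library's xml.etree.ElementTree, which supports a small XPath subset. The markup is hypothetical (it is not copied from the site, and the arrow__left class name is an assumption); only the arrow__right class comes from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination markup, NOT copied from the real site.
PAGE_1 = "<ul><li class='arrow__right'><a href='?start=64'/></li></ul>"
PAGE_2 = (
    "<ul>"
    "<li class='arrow__left'><a href='?start=0'/></li>"
    "<li class='arrow__right'><a href='?start=128'/></li>"
    "</ul>"
)

POSITIONAL = "./li[1]/a"                        # first <li>, whatever it is
BY_CLASS = "./li[@class='arrow__right']/a"      # the <li> marked as "next"


def first_href(markup, path):
    """Return the href of the first element matching `path`, or None."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.get("href")


for name, markup in (("page 1", PAGE_1), ("page 2", PAGE_2)):
    print(name, first_href(markup, POSITIONAL), first_href(markup, BY_CLASS))
```

With this sample markup, the positional path returns the "previous" link on page 2, while the class-based path keeps returning the "next" link.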

0

The next button tends to vary from webpage to webpage. You will have to inspect the button and address it with XPath or BeautifulSoup.

There are usually both "next page" and "previous page" buttons, so make sure your XPath targets "next".
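
As a rough illustration of locating the "next" link by parsing the page's HTML, here is a sketch that uses the standard library's html.parser rather than BeautifulSoup (a deliberate swap to keep the example dependency-free). NextLinkParser and next_page_url are hypothetical names, and the code assumes the "next" link is an <a> inside an <li> whose class list contains arrow__right, as the selectors elsewhere in this thread suggest.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkParser(HTMLParser):
    """Record the href of the first <a> inside <li class="arrow__right">."""

    def __init__(self):
        super().__init__()
        self._in_next_li = False
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            classes = (attrs.get("class") or "").split()
            self._in_next_li = "arrow__right" in classes
        elif tag == "a" and self._in_next_li and self.next_href is None:
            self.next_href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_next_li = False


def next_page_url(page_source, base_url):
    """Return the absolute URL of the next page, or None if there is none."""
    parser = NextLinkParser()
    parser.feed(page_source)
    if parser.next_href is None:
        return None
    return urljoin(base_url, parser.next_href)
```

With Selenium, page_source could come from driver.page_source and base_url from driver.current_url; the returned URL can then be passed to driver.get.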
