Clicking a loading button with Selenium doesn't work

Question

I am trying to load all the comments on this site so I can scrape them, but I can't figure out how to load them all.

When I run my code, I get this error in the console:

    WebDriverWait(driver, 20).until(EC.element_to_be_clickable(
  File "C:\Users\Jakub\dev\rok_quests\rok_quests\Lib\site-packages\selenium\webdriver\support\wait.py", line 95, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

Doesn't that mean it either can't find the button or can't click on it?

The URL I use:

https://www.rok.guide/buildings/lyceum-of-wisdom/

The code below is meant to load all the comments in the comments section; afterwards I will get page_source and scrape it.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import time


def scrape_comments(url):
    # Set up Chrome driver options
    chrome_options = Options()
    chrome_options.add_argument("start-maximized")
    chrome_options.add_argument('disable-infobars')
    chrome_options.add_argument("--block-notifications")
    chrome_options.add_argument("--headless")
    driver = webdriver.Chrome(options=chrome_options)
    wait = WebDriverWait(driver, 10)
    comments = []
    try:
        # Open the website
        driver.get(url)
        get_url = driver.current_url

        wait.until(EC.url_to_be(url))

        if get_url != url:
            raise Exception('Site url doesnt match')
        WebDriverWait(driver, 20).until(EC.visibility_of_element_located(
            (By.CSS_SELECTOR, ".wpd-comment-text")))

        while True:
            try:
                WebDriverWait(driver, 20).until(EC.element_to_be_clickable(
                    (By.XPATH, "/html/body/div[3]/div/div[1]/main/div/div[2]/div[2]/div[3]/div[3]/div[51]/div/button"))).click()
                print("clicked")
            except TimeoutError:
                print("No more to load")
                break

        print(driver.page_source)
        return comments
    finally:
        # Close the web driver
        driver.quit()

but it doesn't work.

Saying that something "doesn't work" is a poor description of the problem. Instead, tell us what the code actually does, and explain what you wanted instead.
– John Gordon, Commented May 27, 2023 at 17:51
@JohnGordon Thanks for the suggestion. I tried to explain my problem; I hope it is understandable.
– IvonaK, Commented May 27, 2023 at 18:12

Andrej Kesely · Accepted Answer · 2023-05-27 20:09:16Z

You can load the comments without Selenium by posting to the site's AJAX endpoint with requests and parsing the returned HTML with BeautifulSoup. For example:

import requests
from bs4 import BeautifulSoup

api_url = 'https://www.rok.guide/wp-admin/admin-ajax.php'

headers = {
    'X-Requested-With': 'XMLHttpRequest',
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0'
}

multipart_form_data = {
    'action': (None, 'wpdLoadMoreComments'),
    'sorting': (None, 'newest'),
    'offset': (None, '0'),
    'lastParentId': (None, '0'),
    'isFirstLoad': (None, '1'),
    'wpdType': (None, ''),
    'postId': (None, '3568'),
    'wpdiscuz_nonce': (None, ''),
}

ofs = 0
while True:
    data = requests.post(api_url, headers=headers, files=multipart_form_data).json()
    soup = BeautifulSoup(data['data']['comment_list'], 'html.parser')

    for c in soup.select('.comment'):
        print(c.select_one('.wpd-comment-text').get_text(strip=True, separator='\n'))
        print('-'*80)

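    # Stop once the server reports there is nothing more to load
    if not data['data']['is_show_load_more']:
        break

    # Ask for the next batch, continuing after the last parent comment seen
    multipart_form_data['lastParentId'] = (None, str(data['data']['last_parent_id']))
    ofs += 1
    multipart_form_data['offset'] = (None, str(ofs))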

Prints:


...

Q: In RoK, the Throwing Axeman is which civilization’s special unit?
A: France
--------------------------------------------------------------------------------
Q: In Ark of Osiris, how many teleports does the first alliance to occupy an obelisk earn?
A: 8
--------------------------------------------------------------------------------
Q: Which of the following is not a natural resource?
A: Clothes
--------------------------------------------------------------------------------
Q: French National Day is on July 14 in order to coincide with which historical event?
A: Storming of the Bastille
--------------------------------------------------------------------------------
Q: In Ark of Osiris, how many teleports does the first alliance to occupy an obelisk earn?
A: 8
--------------------------------------------------------------------------------
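
The paging logic in the loop above can also be separated from the HTTP call, which makes it possible to check it without hitting the site. The following is a minimal sketch rather than part of the original code: `iter_pages`, `fetch_page`, `fake_fetch` and `FAKE_PAGES` are hypothetical names, and the fake responses only mimic the `is_show_load_more` and `last_parent_id` fields used above.

```python
from typing import Callable, Dict, Iterator


def iter_pages(fetch_page: Callable[[int, str], Dict]) -> Iterator[Dict]:
    """Yield each response's 'data' dict until the server reports no more pages.

    fetch_page(offset, last_parent_id) is a hypothetical callable that would
    perform the POST shown above and return the decoded JSON.
    """
    offset, last_parent_id = 0, '0'
    while True:
        data = fetch_page(offset, last_parent_id)['data']
        yield data
        if not data['is_show_load_more']:
            break
        # Same bookkeeping as the loop above
        last_parent_id = str(data['last_parent_id'])
        offset += 1


# Made-up stand-in for the real endpoint: three fake pages
FAKE_PAGES = [
    {'comment_list': '<p>a</p>', 'is_show_load_more': True, 'last_parent_id': 30},
    {'comment_list': '<p>b</p>', 'is_show_load_more': True, 'last_parent_id': 20},
    {'comment_list': '<p>c</p>', 'is_show_load_more': False, 'last_parent_id': 10},
]
calls = []


def fake_fetch(offset, last_parent_id):
    # Record the arguments so the paging sequence can be inspected
    calls.append((offset, last_parent_id))
    return {'data': FAKE_PAGES[offset]}


pages = list(iter_pages(fake_fetch))
```

With the real endpoint, `fetch_page` would wrap the `requests.post(...)` call and set the `offset` and `lastParentId` form fields from its two arguments.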

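As for the Selenium traceback in the question: a `TimeoutException` raised by `WebDriverWait(...).until(...)` means the condition (here, a clickable element at that XPath) was not met within the timeout. The question's loop also catches Python's built-in `TimeoutError`, while the traceback shows `selenium.common.exceptions.TimeoutException`, which is a different class, so the `except` branch never runs. The sketch below illustrates that mismatch without a browser; `FakeSeleniumTimeout`, `wait_for_button` and `run` are hypothetical stand-ins, not Selenium APIs.

```python
class FakeSeleniumTimeout(Exception):
    """Hypothetical stand-in for selenium.common.exceptions.TimeoutException,
    which likewise derives from Exception rather than from TimeoutError."""


def wait_for_button():
    # Hypothetical stand-in for WebDriverWait(...).until(...) timing out
    raise FakeSeleniumTimeout("button not clickable within timeout")


def run(exception_to_catch):
    """Return 'handled' if the inner except clause matches, else 'escaped'."""
    try:
        try:
            wait_for_button()
        except exception_to_catch:
            return "handled"
    except Exception:
        # The inner clause did not match, so the exception propagated
        return "escaped"


wrong = run(TimeoutError)         # built-in class: does not match
right = run(FakeSeleniumTimeout)  # the class actually raised: matches
print(wrong, right)
```

In the Selenium code itself, the corresponding change would be `from selenium.common.exceptions import TimeoutException` and `except TimeoutException:`. A locator that does not hard-code the comment count (the XPath in the question ends in `div[51]`) would probably be more robust as well.
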