Python Selenium find_element_by_xpath

Question

I want to extract text "3351500920037" from the following code:

<div class="specs">
    <h3 class="h4">Productinformatie</h3>
    <dl class="specs__list">

        <dt class="specs__title">
        Gewicht

      </dt>
        <dd class="specs__value">

            0,3 kg

        </dd>

        <dt class="specs__title">
        EAN

      </dt>
        <dd class="specs__value">

            3351500920037

        </dd>

    </dl>
</div>

I use

ref_code = driver.find_element_by_xpath('//*[contains(text(),"EAN")]/following-sibling::dd').text

When I print ref_code, it seems to take only the first line of the text, so it appears empty.

What I have:

print(ref_code)

I would like to have:

print(ref_code)
3351500920037

How can I get the whole text, including the following lines?

Please add the result of printing ref_code and what you expect to see. Please also add the HTML as text.
– Sers, Commented Sep 28, 2019 at 7:40
Nobody will type out the HTML structure from an image for you. Please add the HTML as code in the post so that people can reproduce the problem.
– Dev, Commented Sep 28, 2019 at 9:29
Sorry about that. I just edited my question to replace the images. Thanks.
– Henri, Commented Sep 28, 2019 at 10:56

Sers · Accepted Answer · 2019-09-28 15:30:18Z

Here is code that gets all EAN numbers from the first search results page. You can improve it by first going through all result pages to collect all the links:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Selenium 3 style; Selenium 4 takes the driver path through a Service object instead
driver = webdriver.Chrome('/usr/local/bin/chromedriver')
wait = WebDriverWait(driver, 20)

query = "Azzaro Chrome 100 ml"
driver.get("https://www.bol.com")

# type the query and press Enter (Keys.ENTER is u'\ue007')
driver.find_element(By.ID, "searchfor").send_keys(query, Keys.ENTER)

# wait for presence of all product <a> elements
products = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "li.product-item--row a.product-title")))
# collect the href of each product before navigating away from the results page
product_links = [product.get_attribute("href") for product in products]

# open each product page and read the EAN from the data-ean attribute
for link in product_links:
    driver.get(link)
    ref_code = driver.find_element(By.CSS_SELECTOR, "a[data-ean]").get_attribute("data-ean")
    print(ref_code)
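
The suggestion to go through all result pages first could be sketched roughly as below. This is only an outline under assumptions: `collect_all_links`, `get_links`, `go_to_next` and `max_pages` are hypothetical names, and the site-specific "next page" selector is not given in this thread, so it is left to the caller.

```python
def collect_all_links(get_links, go_to_next, max_pages=50):
    """Gather links page by page until go_to_next() reports no further page.

    get_links:  callable returning the hrefs found on the current page
    go_to_next: callable that moves to the next page and returns True,
                or returns False when there is no next page
    max_pages:  safety cap (an arbitrary assumption) to avoid endless loops
    """
    seen = set()
    links = []
    for _ in range(max_pages):
        for href in get_links():
            # skip missing hrefs and products already collected
            if href and href not in seen:
                seen.add(href)
                links.append(href)
        if not go_to_next():
            break
    return links
```

With Selenium, `get_links` would wrap the `wait.until(...)` call and the list comprehension from the code above, and `go_to_next` would click whatever pagination control the site provides. Collecting plain href strings before navigating also avoids holding element references that go stale once `driver.get()` loads another page.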

edited Sep 28, 2019 at 15:30

answered Sep 28, 2019 at 11:06


5 Comments

Henri Over a year ago

Thanks, but I need it in Python only

Henri Over a year ago

Unless I'm mistaken, I have the same issue as with my initial code. I got the following error: AttributeError: 'visibility_of_element_located' object has no attribute 'text'

Sers Over a year ago

All you need to do is open the browser and navigate to the URL, then copy my code without any modifications on your side. If you still get an error, update your question with your full code.

Henri Over a year ago

My navigation to the URL works fine. I then use exactly your code and get the following error: selenium.common.exceptions.TimeoutException: Message:

Sers Over a year ago

Let us continue this discussion in chat.

Corey Goldberg · Accepted Answer · 2019-09-30 13:25:01Z

The item is not visible on the page, which is why visibility_of_element_located() times out with a TimeoutException.

To extract the text 3351500920037, use WebDriverWait with presence_of_element_located() and then read get_attribute('textContent'). That gives the result you are looking for.

print(WebDriverWait(driver,20).until(EC.presence_of_element_located((By.XPATH, "//*[contains(.,'EAN')]/following-sibling::dd[1]"))).get_attribute('textContent'))
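
As a rough illustration of why this helps: `.text` returns only rendered text, so it is empty for an element that is not visible, while `textContent` is the raw DOM text and keeps the newlines and indentation from the markup. The helper below is a sketch, not part of Selenium; `element_text` and `FakeElement` are hypothetical names, and `FakeElement` only stands in for a WebElement so the example runs without a browser.

```python
def element_text(element):
    """Return the element's text, falling back to raw DOM text if nothing is rendered."""
    visible = element.text  # rendered text only; "" for hidden elements
    if visible.strip():
        return visible.strip()
    raw = element.get_attribute("textContent") or ""
    # collapse the newlines and indentation that textContent preserves
    return " ".join(raw.split())


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (assumption for this demo)."""

    def __init__(self, text, text_content):
        self.text = text
        self._text_content = text_content

    def get_attribute(self, name):
        return self._text_content if name == "textContent" else None


# mimics the hidden <dd> from the question: no rendered text, padded raw text
hidden_dd = FakeElement("", "\n\n            3351500920037\n\n        ")
print(element_text(hidden_dd))  # 3351500920037
```

The same function can be passed the element returned by `WebDriverWait(...).until(EC.presence_of_element_located(...))`.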

This is the full code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 20)
driver.get("https://www.bol.com/")

# search for the product
query = 'Azzaro Chrome 100 ml'
search_element = wait.until(EC.element_to_be_clickable((By.ID, "searchfor")))
search_element.send_keys(query)
search_element.submit()

# open the first product in the results
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".product-title.px_list_page_product_click"))).click()

# the <dd> is present in the DOM but not visible, so read textContent instead of .text
ean_element = wait.until(EC.presence_of_element_located((By.XPATH, "//*[contains(.,'EAN')]/following-sibling::dd[1]")))
print(ean_element.get_attribute('textContent'))
driver.quit()
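
To check offline which `<dd>` the `following-sibling::dd[1]` step pairs with each `<dt>`, the markup from the question can be parsed with the standard library. This is a sketch under assumptions: `parse_specs` is a hypothetical helper, the snippet must be well-formed XML (the one in the question is), and because ElementTree does not support the `following-sibling` axis the pairing is done by walking the `<dl>` children in order.

```python
import xml.etree.ElementTree as ET

# the markup from the question, with blank lines removed
HTML = """<div class="specs">
    <h3 class="h4">Productinformatie</h3>
    <dl class="specs__list">
        <dt class="specs__title">
        Gewicht
      </dt>
        <dd class="specs__value">
            0,3 kg
        </dd>
        <dt class="specs__title">
        EAN
      </dt>
        <dd class="specs__value">
            3351500920037
        </dd>
    </dl>
</div>"""


def parse_specs(markup):
    """Map each <dt> title to the text of the first <dd> that follows it."""
    root = ET.fromstring(markup)
    specs = {}
    title = None
    for child in root.find("dl"):
        text = "".join(child.itertext()).strip()
        if child.tag == "dt":
            title = text
        elif child.tag == "dd" and title is not None:
            specs[title] = text
            title = None  # only the first <dd> after a <dt> is taken
    return specs


specs = parse_specs(HTML)
print(specs["EAN"])  # 3351500920037
```

Note that `contains(.,'EAN')` also matches every ancestor whose text contains "EAN". A narrower locator such as `//dt[normalize-space()='EAN']/following-sibling::dd[1]` may be safer on a full page; that variant is a suggestion and does not come from the answer above.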
