I'm trying to capture the links of a webpage using Selenium in Python. My initial code is:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd
import time
from tqdm import tqdm
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome()  # assumed browser; the driver setup was not shown in the snippet
driver.get('https://www.lovecrave.com/shop/')

Then I identified all the products (12) on the page using:

perso_flist = driver.find_elements_by_xpath("//p[@class='excerpt']")

Then, I want to capture the links for each product by using:

listOflinks = []
for i in perso_flist:
    link_1=i.find_elements_by_xpath(".//a[@href[1]]")
    listOflinks.append(link_1)
print(listOflinks)

And my output looks like:

print(listOflinks)  # 12 EMPTY VALUES
[[], [], [], [], [], [], [], [], [], [], [], []]

What is wrong with my code? I'll appreciate your help.

  • The XPath in this line is not matching anything: i.find_elements_by_xpath(".//a[@href[1]]") Commented Feb 11, 2021 at 15:31
  • Please add the HTML of an element matched by the XPath "//p[@class='excerpt']". Commented Feb 11, 2021 at 15:34
  • Thanks Jortega. How could I fix this issue in my example? Commented Feb 11, 2021 at 15:34

2 Answers

The links are not inside the p elements; they are the a elements that follow them. Select those a tags and read the href attribute of each one.

hrefs=[x.get_attribute("href") for x in driver.find_elements_by_xpath("//p[@class='excerpt']/following-sibling::a[1]")]
print(hrefs)

Alternatively, use the XPath //li/a[@class='full-link'].

Output:

['https://www.lovecrave.com/products/duet-pro/',
 'https://www.lovecrave.com/products/vesper/',
 'https://www.lovecrave.com/products/wink/',
 'https://www.lovecrave.com/products/duet/',
 'https://www.lovecrave.com/products/duet-flex/',
 'https://www.lovecrave.com/products/flex/',
 'https://www.lovecrave.com/products/pocket-vibe/',
 'https://www.lovecrave.com/products/bullet/',
 'https://www.lovecrave.com/products/cuffs/',
 'https://www.lovecrave.com/shop/gift-card/',
 'https://www.lovecrave.com/shop/leather-case/',
 'https://www.lovecrave.com/shop/vesper-replacement-charger/']
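
A minimal sketch of why the original search returned empty lists, assuming (as the following-sibling XPath above suggests) that each link sits next to its excerpt paragraph rather than inside it. The markup, the example.com URLs, and the use of the standard library's xml.etree.ElementTree in place of a live browser are all assumptions made for illustration. ElementTree supports only a subset of XPath, so the second XPath from this answer is used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: an assumption about how one product tile is laid out,
# with the link as a sibling of the excerpt paragraph, not a child of it.
SAMPLE = """
<ul>
  <li>
    <p class="excerpt">Short product description</p>
    <a class="full-link" href="https://example.com/products/one/"></a>
  </li>
  <li>
    <p class="excerpt">Another description</p>
    <a class="full-link" href="https://example.com/products/two/"></a>
  </li>
</ul>
""".strip()

root = ET.fromstring(SAMPLE)
paragraphs = root.findall(".//p[@class='excerpt']")

# Searching for links inside each paragraph finds nothing,
# because the paragraphs contain only text.
inside = [p.findall(".//a") for p in paragraphs]

# Asking the paragraph itself for an href yields None,
# because the attribute lives on the <a> element.
on_paragraph = [p.get("href") for p in paragraphs]

# Selecting the anchors directly gives the links.
hrefs = [a.get("href") for a in root.findall(".//li/a[@class='full-link']")]

print(inside)
print(on_paragraph)
print(hrefs)
```

The first two lists mirror the empty and None results reported in the thread; only the third lookup targets the element that actually carries the href.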

I am making some assumptions about the XPath //p[@class='excerpt']. If the code below does not work, please add an HTML example of the element.

You can get a list of link elements by making this update:

perso_flist = driver.find_elements_by_xpath("//li//a[@class='full-link']")

Then loop through the list, calling get_attribute("href") on each element:

listOflinks = []
for i in perso_flist:
    link_1=i.get_attribute("href")
    listOflinks.append(link_1)
print(listOflinks)
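
On newer Selenium 4 releases the find_elements_by_xpath helpers are, as far as I know, no longer available, and the same lookup is written as find_elements(By.XPATH, ...). A sketch of the loop above in that style follows. collect_hrefs is a hypothetical helper name, and the locator strategy is written as the literal string "xpath" (the value of By.XPATH) so that the function can be tried without a browser.

```python
from typing import List

# Same value as selenium.webdriver.common.by.By.XPATH; spelled out here
# (an assumption for illustration) so the sketch needs no Selenium import.
XPATH = "xpath"


def collect_hrefs(driver, xpath: str) -> List[str]:
    """Return the href of every element matched by xpath, skipping empty ones."""
    hrefs = []
    for element in driver.find_elements(XPATH, xpath):
        href = element.get_attribute("href")
        if href:  # elements without an href give None and are skipped
            hrefs.append(href)
    return hrefs
```

With a driver that has already loaded the shop page, the call would be collect_hrefs(driver, "//li//a[@class='full-link']").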

4 Comments

Thanks, Jortega. After adding //a, my output is still empty. What other HTML element may I use?
@mlosada In addition to adding //a also update your loop. I need to know what the output of i.get_attribute('innerHTML') returns. I will also add this line to the answer.
My output is a short description of each product taken from the products page, followed by: [None, None, None, None, None, None, None, None, None, None, None, None]. My list of links is still empty.
As I said before, after applying the updated code I get the product descriptions, and my list of links remains empty.
