Using Selenium in Python to capture the links on a web page

Question

I'm trying to capture the links of a webpage using Selenium in Python. My initial code is:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd
import time
from tqdm import tqdm
from selenium.common.exceptions import NoSuchElementException
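# The original post omitted creating the driver; Chrome is assumed here.
driver = webdriver.Chrome()
driver.get('https://www.lovecrave.com/shop/')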

Then, I identified all the products (12) on the page by using:

perso_flist = driver.find_elements_by_xpath("//p[@class='excerpt']")

Then, I want to capture the links for each product by using:

listOflinks = []
for i in perso_flist:
    link_1=i.find_elements_by_xpath(".//a[@href[1]]")
    listOflinks.append(link_1)
print(listOflinks)

And my output looks like:

print(listOflinks)  # 12 EMPTY VALUES
[[], [], [], [], [], [], [], [], [], [], [], []]

What is wrong with my code? I'll appreciate your help.

The XPath in this line is not matching anything: i.find_elements_by_xpath(".//a[@href[1]]")
– Jortega, Commented Feb 11, 2021 at 15:31
Please add the HTML of an element matched by the XPath "//p[@class='excerpt']".
– Jortega, Commented Feb 11, 2021 at 15:34

Arundeep Chohan · Accepted Answer · 2021-02-11 19:59:13Z

1

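Your loop searches for a tags inside each p element (.//a), but on this page the link is a following sibling of the p, not a descendant, so every search returns an empty list. Instead, select the a tags directly and read their href attribute: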

hrefs=[x.get_attribute("href") for x in driver.find_elements_by_xpath("//p[@class='excerpt']/following-sibling::a[1]")]
print(hrefs)

Alternatively, use the XPath //li/a[@class='full-link'].

Output:

['https://www.lovecrave.com/products/duet-pro/',
 'https://www.lovecrave.com/products/vesper/',
 'https://www.lovecrave.com/products/wink/',
 'https://www.lovecrave.com/products/duet/',
 'https://www.lovecrave.com/products/duet-flex/',
 'https://www.lovecrave.com/products/flex/',
 'https://www.lovecrave.com/products/pocket-vibe/',
 'https://www.lovecrave.com/products/bullet/',
 'https://www.lovecrave.com/products/cuffs/',
 'https://www.lovecrave.com/shop/gift-card/',
 'https://www.lovecrave.com/shop/leather-case/',
 'https://www.lovecrave.com/shop/vesper-replacement-charger/']
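
To see why the question's loop came back empty, here is a minimal sketch that needs no browser. It uses the standard library's xml.etree.ElementTree in place of Selenium, and the markup is hypothetical: the page's real HTML is not shown in this thread, so the sample only mirrors what the XPath above implies (each a follows its p as a sibling inside an li). The example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (an assumption, not the real page): each <a> is a
# sibling of the <p class="excerpt">, not a child of it.
SAMPLE = """
<ul>
  <li>
    <p class="excerpt">Short description one</p>
    <a class="full-link" href="https://example.com/products/one/">View</a>
  </li>
  <li>
    <p class="excerpt">Short description two</p>
    <a class="full-link" href="https://example.com/products/two/">View</a>
  </li>
</ul>
"""

root = ET.fromstring(SAMPLE.strip())

# Searching inside each <p>, as the question's loop does, finds no <a>.
inside_p = [p.findall(".//a") for p in root.findall(".//p[@class='excerpt']")]

# Selecting <a class="full-link"> directly under each <li> finds the links.
hrefs = [a.get("href") for a in root.findall(".//li/a[@class='full-link']")]

print(inside_p)  # [[], []]
print(hrefs)
```

ElementTree supports only a subset of XPath (it has no following-sibling axis), which is why the sketch uses the //li/a form rather than the first expression.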


undetected Selenium · Accepted Answer · 2021-02-11 19:02:58Z

1

I am making some assumptions about the XPath //p[@class='excerpt']. If the code below does not work, please add an HTML example of the element.

You can get a list of link elements by making this update:

perso_flist = driver.find_elements_by_xpath("//li//a[@class='full-link']")

Then loop through the list using element.get_attribute():

listOflinks = []
for i in perso_flist:
    link_1=i.get_attribute("href")
    listOflinks.append(link_1)
print(listOflinks)
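
On newer Selenium releases the find_elements_by_xpath helpers are no longer available (they were deprecated in Selenium 4 and later removed), so the same loop might be written as below. This is a sketch, not the answerer's code: collect_hrefs is a hypothetical helper name, and the default XPath is the one suggested above, which assumes the page still uses the full-link class.

```python
def collect_hrefs(driver, xpath="//li/a[@class='full-link']"):
    """Return the non-empty href values of all elements matching xpath.

    `driver` is any object with Selenium's WebDriver.find_elements(by, value)
    method. The helper name and the default XPath are assumptions.
    """
    # "xpath" is the string value of selenium.webdriver.common.by.By.XPATH.
    elements = driver.find_elements("xpath", xpath)
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href:  # skip matches without an href (get_attribute returns None)
            hrefs.append(href)
    return hrefs
```

With a real browser session, this would be called as collect_hrefs(driver) after driver.get(...). Skipping empty values avoids the list of None entries that appears when the matched elements carry no href attribute.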

edited Feb 11, 2021 at 19:02 by undetected Selenium

answered Feb 11, 2021 at 15:39 by Jortega

4 Comments

mlosada Over a year ago

Thanks, Jortega. After adding //a, my output is still empty. What other HTML element may I use?

Jortega Over a year ago

@mlosada In addition to adding //a, also update your loop. I need to know what i.get_attribute('innerHTML') returns. I will also add this line to the answer.

mlosada Over a year ago

My output is a short description of each product taken from the products page, followed by: [None, None, None, None, None, None, None, None, None, None, None, None]. My list of links is still empty.

mlosada Over a year ago

As I said before, after applying the updated code I get the descriptions of the products, and my list of links remains empty.
