Fetch only specific links using Selenium in Python

Question

I am trying to fetch the links to all news articles about Apple from this page: https://finance.yahoo.com/quote/AAPL/news?p=AAPL. The page also contains many advertisement links and links to other parts of the site. How do I fetch only the links to news articles? Here is the code I have written so far:

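from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service('C:\\Users\\Home\\OneDrive\\Desktop\\AJ\\chromedriver_win32\\chromedriver.exe'))
driver.get("https://finance.yahoo.com/quote/AAPL/news?p=AAPL")
links = []
for a in driver.find_elements(By.XPATH, './/a'):  # every anchor on the page
    links.append(a.get_attribute('href'))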

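import requests
from bs4 import BeautifulSoup

def get_info(url):
    # Send the request
    response = requests.get(url)
    # Parse the HTML; naming the parser explicitly avoids a bs4 warning
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the article body, headline, and date
    news = soup.find('div', attrs={'class': 'caas-body'}).text
    headline = soup.find('h1').text
    date = soup.find('time').text
    return news, headline, date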

Can anyone show me how to do this, or point me to a resource that covers it? Thanks!

May be useful: .//a[starts-with(@href,"/news") or starts-with(@href,"/m")]. It may also help to learn XPath syntax.
– Trock, commented Sep 18, 2021 at 7:31
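
The prefix test from the comment above can also be applied in Python after the links have been collected. The following is only a sketch under some assumptions: the function name `filter_article_links` and the sample URLs are made up for illustration, and it compares the URL path rather than the raw attribute, since Selenium generally returns `href` as a resolved absolute URL. Note that the `/m` prefix is broad and may match more than articles.

```python
from urllib.parse import urlparse

# Prefixes taken from the XPath in the comment above
ARTICLE_PREFIXES = ("/news", "/m")

def filter_article_links(hrefs, prefixes=ARTICLE_PREFIXES):
    """Keep hrefs whose URL path starts with one of the prefixes.

    Order is preserved; empty values and duplicates are dropped.
    """
    kept = []
    seen = set()
    for href in hrefs:
        if not href:  # anchors without an href come back as None
            continue
        path = urlparse(href).path
        if path.startswith(prefixes) and href not in seen:
            seen.add(href)
            kept.append(href)
    return kept

# Hypothetical sample of what the collected list might look like
sample = [
    "https://finance.yahoo.com/news/example-apple-story-123.html",
    "https://finance.yahoo.com/m/abc/example.html",
    "https://finance.yahoo.com/quote/AAPL/options",
    None,
    "https://finance.yahoo.com/news/example-apple-story-123.html",
]
print(filter_article_links(sample))
```

With the question's code, this would be called as `filter_article_links(links)` once the loop over the anchors has finished.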

pmadhu · Accepted Answer · 2021-09-18 08:17:59Z

Try this XPath to select only the news links on that page:

//li[contains(@class,'js-stream-content')]/div[@data-test-locator='mega']//h3/a

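import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes chromedriver can be found (on PATH or via Selenium Manager)
driver.implicitly_wait(10)
driver.maximize_window()

driver.get("https://finance.yahoo.com/quote/AAPL/news?p=AAPL")
time.sleep(10)  # crude pause to let the news stream finish loading
links = driver.find_elements(By.XPATH, "//li[contains(@class,'js-stream-content')]/div[@data-test-locator='mega']//h3/a")
for link in links:
    print(link.get_attribute("href"))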
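
To see what each step of that XPath selects without opening a browser, here is a rough standard-library sketch. The HTML in `SAMPLE` is invented for illustration and is not the real Yahoo markup; `select_news_hrefs` is a hypothetical helper. ElementTree supports only a subset of XPath (no `contains()`), so the class check is done in Python as a substring test, which is what `contains(@class, ...)` does.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup: one news item, one ad, one unrelated item
SAMPLE = """
<ul>
  <li class="js-stream-content Pos(r)">
    <div data-test-locator="mega">
      <div><h3><a href="/news/apple-story-1.html">Story 1</a></h3></div>
    </div>
  </li>
  <li class="js-stream-content Pos(r)">
    <div data-test-locator="ad">
      <h3><a href="https://ads.example.com/x">Ad</a></h3>
    </div>
  </li>
  <li class="other">
    <div data-test-locator="mega"><h3><a href="/quote/MSFT">Other</a></h3></div>
  </li>
</ul>
"""

def select_news_hrefs(markup):
    """Mirror //li[contains(@class,'js-stream-content')]/div[@data-test-locator='mega']//h3/a."""
    root = ET.fromstring(markup)
    hrefs = []
    for li in root.iter("li"):  # //li
        if "js-stream-content" not in li.get("class", ""):  # contains(@class, ...)
            continue
        # direct child div with the 'mega' locator
        for div in li.findall("div[@data-test-locator='mega']"):
            # any h3 below that div, then its direct child a
            for a in div.findall(".//h3/a"):
                hrefs.append(a.get("href"))
    return hrefs

print(select_news_hrefs(SAMPLE))
```

Only the first item survives: the ad fails the `data-test-locator='mega'` test and the third item fails the class test, which is how the XPath keeps advertisements and navigation links out of the result.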

