I am trying to click the "Show More" button on a website automatically using Selenium and then scrape the content using BeautifulSoup.

My code runs but does not give the desired results, and I can't find what I am doing wrong.

For Selenium: my code clicks the "Show More" button, but not consistently. Sometimes it clicks 5 times and sometimes 10 times. I want it to keep clicking until the last "Show More" button is gone.

For BeautifulSoup: along with loading everything behind "Show More", I want to scrape the title of each article, but my code stops after the first click.

import time
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

base = "https://www.nytimes.com"
browser = webdriver.Safari(executable_path = '/usr/bin/safaridriver')
browser.get('https://www.nytimes.com/search?endDate=20190331&query=cybersecurity&sort=newest&startDate=20180401')

soup = BeautifulSoup(browser.page_source,'lxml')

for link in soup.select(".css-138we14 a"):
    resp = requests.get(base + link.get("href"))
    sauce = BeautifulSoup(resp.text, "lxml")
    title = sauce.select_one("h1.css-1j5ig2m.e1h9rw200").text
    print(title)

    while True:
        try:
            show_more = browser.find_element_by_xpath('//button[@type="button"][contains(.,"Show More")]').click()
        except Exception as e:
            print(e)
            break

print("Complete")

time.sleep(10)

browser.quit()

As I mentioned, I want the code to run until the last "Show More" button, and I want to scrape the titles of all the articles (335 in total).

  • Side note: waiting for the button to be clickable might be useful. Commented May 22, 2019 at 8:27

1 Answer

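As stated in the comment, you may want to wait until the "Show More" button is clickable before clicking it. Note also that the page source should be parsed only after the click loop has finished, so that every loaded result is in the soup.

So something like this: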

import time
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

base = "https://www.nytimes.com"
browser = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
wait = WebDriverWait(browser, 10)
browser.get('https://www.nytimes.com/search?endDate=20190331&query=cybersecurity&sort=newest&startDate=20180401')

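# Keep clicking "Show More" until the button stops appearing
while True:
    try:
        time.sleep(1)
        # Wait up to 10 seconds for the button to become clickable
        show_more = wait.until(EC.element_to_be_clickable((By.XPATH, '//button[@type="button"][contains(.,"Show More")]')))
        show_more.click()
    except Exception as e:
        # The wait timed out (or the click failed): assume no button is left
        print(e)
        break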

# Parse the fully expanded page once, after all the clicks are done
soup = BeautifulSoup(browser.page_source,'lxml')
search_results = soup.find('ol', {'data-testid':'search-results'})

links = search_results.find_all('a')
for link in links:
    title = link.find('h4').text
    date = link.find_next('time').text
    print(date + ': '+ title)

print("Complete")

browser.quit()

Output:

March 31: Bezos’ Security Consultant Accuses Saudis of Hacking the Amazon C.E.O.’s Phone
March 29: In Ukraine, Russia Tests a New Facebook Tactic in Election Tampering
March 29: Huawei Shrugs Off U.S. Clampdown With a $100 Billion Year
March 28: N.S.A. Contractor Arrested in Biggest Breach of U.S. Secrets Pleads Guilty
March 28: Grindr Is Owned by a Chinese Firm, and the U.S. Is Trying to Force It to Sell
March 28: DealBook Briefing: Saudi Arabia Wanted Cash. Aramco Just Obliged.
March 28: Huawei Security ‘Defects’ Are Found by British Authorities
March 25: As Special Counsel, Mueller Kept Such a Low Profile He Seemed Almost Invisible
March 22: Quotation of the Day: In New Age of Digital Warfare, Spies for Any Nation’s Budget
March 22: Coast Guard’s Top Officer Pledges ‘Dedicated Campaign’ to Improve Diversity
March 21: A New Age of Warfare: How Internet Mercenaries Do Battle for Authoritarian Governments
March 21: Facebook Did Not Securely Store Passwords. Here’s What You Need to Know.
March 18: Homeland Security Chief Cites Top Threat to U.S. (It’s Not the Border)
March 18: Nielsen Warns Against ‘Cyberthugs and Hackers’
March 17: U.S. Campaign to Ban Huawei Overseas Stumbles as Allies Resist
March 13: Vietnam’s Communist Party Ousts Historian Who Criticized Its China Policy
March 11: With Trump’s Budget Out, Democrats Must Now Show Their Cards
March 10: U.S. and China Near Currency Deal, but Provisions May Not Be New
March 8: Facebook Announces Plan to Curb Vaccine Misinformation
March 7: DealBook Briefing: Facebook Prioritizes Privacy. Can It Deliver?
March 7: Locking More Than the Doors as Cars Become Computers on Wheels
March 7: Huawei Sues U.S. Government Over What It Calls an Unfair Ban
March 6: Trump’s 5G Plan Is More Than a Gift to His Base
March 4: China, Huawei, Michael Jackson: Your Tuesday Briefing
March 4: Alphabet’s Security Start-Up Wants to Offer History Lessons
March 4: Huawei Said to Be Preparing to Sue the U.S. Government
March 4: Venezuela, India, North Korea: Your Monday Briefing
March 3: As Trump and Kim Met, North Korean Hackers Hit Over 100 Targets in U.S. and Ally Nations
March 2: Who’s Investigating Justin Trudeau — and What Do They Hope to Find?
March 1: The Week in Tech: How Can America Make the World Shun Huawei?
March 1: After Unpredictable Trump Meeting, Kim Returns to Scripted Form in Vietnam
Feb. 27: As Huawei’s Influence in Canada Grows, Some Fear Spying. Others Just Want Fast Internet.
Feb. 26: Was Russia Treason Trial About U.S. Election Meddling or a Convict’s Revenge?
Feb. 26: U.A.E. to Use Equipment From Huawei Despite American Pressure
Feb. 22: The Week in Tech: Chinese and Iranian Hackers Have Returned
Feb. 22: The Media Is Not the Enemy
Feb. 21: How Israel’s Moon Lander Got to the Launchpad
Feb. 20: Huawei Risks to Britain Can Be Blunted, U.K. Official Says, in a Rebuff to U.S.
Feb. 20: Russian Hackers Targeted European Research Groups, Microsoft Says
Feb. 18: Australia’s Prime Minister Blames ‘Sophisticated State Actor’ for Parliament Hack
Feb. 18: Chinese and Iranian Hackers Renew Their Attacks on U.S. Companies
Feb. 14: Can Berkeley Boycott Amazon?
Feb. 13: The Strange Experience of Being Australia’s First Tech Billionaires
Feb. 13: Turkey, Huawei, Migration: Your Wednesday Briefing
Feb. 12: Huawei Was a Czech Favorite. Now? It’s a National Security Threat.
Feb. 12: Hong Kong, North Korea, U.S.-China Trade: Your Wednesday Briefing
Feb. 11: DealBook Briefing: Brace for Another Government Shutdown
Feb. 10: These 50 Start-Ups May Be the Next ‘Unicorns’
Feb. 10: India, Jeff Bezos, Grammys: Your Monday Briefing
Feb. 8: Huawei Threatens Lawsuit Against Czech Republic After Security Warning
Feb. 8: DealBook Briefing: Jeff Bezos, Blackmail and ‘Below the Belt’ Selfies
Feb. 7: Key Senator Warns of Dangers of Chinese Investment in 5G Networks
Feb. 4: How to Safeguard Your Tech, and Your Money, While Traveling
Jan. 31: Russia’s Playbook for Social Media Disinformation Has Gone Global
Jan. 31: Securing Our Data
Jan. 30: Learning With: ‘In 5G Race With China, U.S. Pushes Allies to Fight Huawei’
Jan. 29: Cybersecurity, Polar Vortex, Kamala Harris: Your Tuesday Evening Briefing
Jan. 29: No People. No Process. No Policy.
Jan. 28: The Case of the Bumbling Spy: A Watchdog Group Gets Him on Camera
Jan. 28: Two-Factor Authentication Might Not Keep You Safe
Jan. 27: Another Side of #MeToo: Male Managers Fearful of Mentoring Women
Jan. 27: In 5G Race With China, U.S. Pushes Allies to Fight Huawei
Jan. 25: The Week in Tech: Silicon Valley Hobnobs in Davos
Jan. 23: World Leaders at Davos Call for Global Rules on Tech
Jan. 23: Lessons for Corporate Boardrooms From Yahoo’s Cybersecurity Settlement
Jan. 22: Did Australia Hurt Phone Security Around the World?
Jan. 22: How Huawei Wooed Europe With Sponsorships, Investments and Promises
Jan. 21: If 5G Is So Important, Why Isn’t It Secure?
Jan. 18: D.N.C. Says It Was Targeted Again by Russian Hackers After ’18 Election
Jan. 17: Facebook Identifies Russia-Linked Misinformation Campaign
Jan. 17: Only One House Republican Represents the Borderland, and He Opposes a Wall
Jan. 15: Hacker for Hire
Jan. 11: E.T.F.s Try to Lure Investors Into Ever Narrower Niches
Jan. 11: Poland Arrests 2, Including Huawei Employee, Accused of Spying for China
Jan. 11: El Chapo Trial: Why His I.T. Guy Had a Nervous Breakdown
Jan. 9: A Border Wall to Stop Terrorists? Experts Say That Makes Little Sense
Jan. 8: DealBook Briefing: A Model to Alleviate Student Debt Gains Traction
Jan. 8: German Man Confesses to Hacking Politicians’ Data, Officials Say
Jan. 8: No Tuition, but You Pay a Percentage of Your Income (if You Find a Job) 
Jan. 7: Democrats Faked Online Push to Outlaw Alcohol in Alabama Race
Jan. 6: Who Wants a Market Downturn? These Investors Actually Do
Jan. 5: Is America’s Political Future in San Antonio?
Jan. 4: Marriott Concedes 5 Million Passport Numbers Lost to Hackers Were Not Encrypted
Jan. 4: Hackers Leak Details of German Lawmakers, Except Those on Far Right
Jan. 3: Devices That Will Invade Your Life in 2019 (and What’s Overhyped)
Jan. 2: Why the World Needs America and China to Get Along
Jan. 2: DealBook Briefing: What Could Go Wrong in 2019? Plenty
Dec. 27, 2018: LinkedIn Co-Founder Apologizes for Deception in Alabama Senate Race
Dec. 27, 2018: Our Cellphones Aren’t Safe
Dec. 21, 2018: In 2018, Did Business Get Too Big?
Dec. 21, 2018: The Week in Tech: Hostages in the U.S.-China Tech Cold War
Dec. 20, 2018: U.S. Accuses Chinese Nationals of Infiltrating Corporate and Government Technology
Dec. 19, 2018: Google’s Marketing of Children’s Apps Misleads Parents, Consumer Groups Say
Dec. 19, 2018: ‘I Can English Understand,’ New Official Says. The Swiss Have Their Doubts.
Dec. 19, 2018: DealBook Briefing: Inside Facebook’s Huge Data Giveaway to Its Big Tech Brethren
Dec. 18, 2018: Michael Flynn, Shutdown, China Trade: Your Tuesday Evening Briefing
Dec. 18, 2018: How You Can Help Fight the Information Wars
Dec. 18, 2018: President Xi, K-Pop, Huawei: Your Wednesday Briefing
Dec. 18, 2018: DealBook Briefing: Did Big Tech Lie to Congress About Russian Interference?
Dec. 18, 2018: Russian Trolls Came for Instagram, Too
Dec. 18, 2018: Sprint, T-Mobile Deal Gets Green Light From U.S. Regulators
Dec. 18, 2018: Yes, Russian Trolls Helped Elect Trump
Dec. 18, 2018: Facebook, Twitter and YouTube Withheld Russia Data, Reports Say
Dec. 17, 2018: What We Now Know About Russian Disinformation
Dec. 17, 2018: Five Takeaways From New Reports on Russia’s Social Media Operations
Dec. 17, 2018: How to Make the Trade War Even Worse
Dec. 17, 2018: Voter Suppression and Racial Targeting: In Facebook’s and Twitter’s Words
Dec. 17, 2018: Russian 2016 Influence Operation Targeted African-Americans on Social Media
Dec. 12, 2018: Cohen Sentencing, Brexit, China Trade: Your Wednesday Evening Briefing
Dec. 12, 2018: Theresa May, China, Michael Cohen: Your Thursday Briefing
Dec. 12, 2018: DealBook Briefing: How Trump Plans to Keep China In Line on Trade
Dec. 12, 2018: China Says Detained Canadian Worked for Group Without Legal Registration
Dec. 11, 2018: Marriott Data Breach Is Traced to Chinese Hackers as U.S. Readies Crackdown on Beijing
Dec. 7, 2018: The Week in Tech: Facebook Is in the News. Again.
Dec. 7, 2018: U.S.-China Friction Threatens to Undercut the Fight Against Climate Change
Dec. 6, 2018: Teenagers in The Times: November 2018
Dec. 5, 2018: Rudy Giuliani Says Twitter Sabotaged His Tweet. Actually, He Did It Himself.
Dec. 4, 2018: House Republican Campaign Committee Says It Was Hacked This Year
Dec. 3, 2018: Kicked Out of Port Authority, Bieber Bus Got a Prime Stop on a Crowded Curb
Nov. 30, 2018: G-20, Marriott, Immigration: Your Friday Evening Briefing
Nov. 30, 2018: Marriott Hacking Exposes Data of Up to 500 Million Guests
Nov. 29, 2018: DealBook Briefing: The Fed’s Chair Sent the Markets Soaring
Nov. 29, 2018: N.Y. Today: Trump vs. Cuomo, Not So Much
Nov. 29, 2018: After a Hiatus, China Accelerates Cyberspying Efforts to Obtain U.S. Technology
Nov. 28, 2018: Iranians Accused in Cyberattacks, Including One That Hobbled Atlanta
Nov. 28, 2018: A Plan to Turn New York Into a Capital of Cybersecurity
Nov. 22, 2018: Time to Make the Donates!
Nov. 22, 2018: How Facebook’s P.R. Firm Brought Political Trickery to Tech
Nov. 21, 2018: Manufacturers Remain Slow to Recognize Cybersecurity Risks
Nov. 20, 2018: A Perfect Target for Cybercriminals 
Nov. 19, 2018: DealBook Briefing: Nissan’s Chairman Faces Criminal Charges Over Secret Compensation
Nov. 16, 2018: Justin Trudeau’s Official Fixer-Upper
Nov. 16, 2018: What Facebook Knew and Tried to Hide
Nov. 16, 2018: Brexit, Macedonia, Facebook: Your Friday Briefing
Nov. 15, 2018: Brexit, Saudi Arabia, Chinese Hospitals: Your Friday Briefing
Nov. 15, 2018: Minister in Charge of Japan’s Cybersecurity Says He Has Never Used a Computer
Nov. 14, 2018: Learning to Attack the Cyberattackers Can’t Happen Fast Enough
Nov. 14, 2018: How Do You Get Students to Think Like Criminals?
Nov. 13, 2018: Georgia’s Shaky Voting System
Nov. 13, 2018: DealBook Briefing: WeWork Might Be Too Big to Fail
Nov. 11, 2018: How a Former Canadian Spy Helps Wall Street Mavens Think Smarter
Nov. 11, 2018: This Week’s Wedding Announcements
Nov. 11, 2018: Ioanna Kefalas, Alexander Niejelow
Nov. 8, 2018: DealBook Briefing: Why Corporate America Is Content With the Midterms
Nov. 7, 2018: The Mad Dash to Find a Cybersecurity Force
Nov. 7, 2018: Russian Trolls Were at It Again Before Midterms, Facebook Says
Nov. 7, 2018: Antonio Delgado Upsets John Faso as 3 House Republicans Fall to N.Y. Democrats
Nov. 6, 2018: Russians Meddling in the Midterms? Here’s the Data
Nov. 6, 2018: Georgia Governor’s Race Is Hurtling Toward Election Day, and Passions Are Rising
Nov. 4, 2018: Consulting Firms Keep Lucrative Saudi Alliance, Shaping Crown Prince’s Vision
Nov. 1, 2018: Mystery of the Midterm Elections: Where Are the Russians?
Nov. 1, 2018: ‘I Am Not an Internet Troll’
Oct. 30, 2018: Chinese Military May Gain From Western University Ties, Report Warns
Oct. 25, 2018: 4 Women Try to Unseat House Republicans in N.Y.; Donors and Celebrities Take Notice
Oct. 24, 2018: Workforce Trends Impacting Deals: Are You Ready?
Oct. 23, 2018: Hack of Saudi Petrochemical Plant Was Coordinated From Russian Institute
Oct. 23, 2018: U.S. Begins First Cyberoperation Against Russia Aimed at Protecting Elections
Oct. 22, 2018: Trump May Revive the Cold War, but China Could Change the Dynamics
Oct. 22, 2018: DealBook Briefing: It’s Tough to Quit Saudi Arabia
Oct. 21, 2018: This Week’s Wedding Announcements
Oct. 21, 2018: Elena Welt, Jason Burke
Oct. 20, 2018: America’s Elections Could Be Hacked. Go Vote Anyway.
Oct. 19, 2018: Saudi Arabia Says Jamal Khashoggi Was Killed in Consulate Fight
Oct. 19, 2018: Five Artificial Intelligence Insiders in Their Own Words
Oct. 16, 2018: Why It’s So Hard to Punish Companies for Data Breaches
Oct. 15, 2018: IBM Takes Cybersecurity Training on the Road
Oct. 15, 2018: A Genocide Incited on Facebook, With Posts From Myanmar’s Military
Oct. 12, 2018: U.S. Stocks Became Expensive. Are Other Countries Better Bets?
Oct. 12, 2018: Facebook Hack Included Search History and Location Data of Millions
Oct. 11, 2018: Internet Hacking Is About to Get Much Worse
Oct. 10, 2018: New U.S. Weapons Systems Are a Hackers’ Bonanza, Investigators Find
Oct. 10, 2018: DealBook Briefing: Sears May Be on the Brink of Bankruptcy
Oct. 9, 2018: She’s a Gun-Owning Democrat. Her Opponent Calls Her an Extreme Liberal.
Oct. 8, 2018: Google Plus Will Be Shut Down After User Information Was Exposed
Oct. 8, 2018: The S.E.C. Dusts Off a Never-Used Cyber Enforcement Tool
Oct. 8, 2018: Australia Should Reverse Its Huawei 5G Ban
Oct. 6, 2018: Hackers, Good and Bad
Oct. 5, 2018: Cybersecurity Risks Should Weigh on Investors’ Minds More Often
Oct. 5, 2018: Will China Hack the U.S. Midterms?
Oct. 4, 2018: Kavanaugh, China, the Nobel Peace Prize: Your Friday Briefing
Oct. 3, 2018: Setting Up Your Tech on the Assumption You’ll Be Hacked
Oct. 3, 2018: DealBook Briefing: How Trump Reaped Riches From His Father
Oct. 2, 2018: Trump’s Reckless Cybersecurity Strategy
Sept. 30, 2018: This Week’s Wedding Announcements
Sept. 30, 2018: Jennifer Berry, Travis Jarae
Sept. 28, 2018: Facebook Security Breach Exposes Accounts of 50 Million Users
Sept. 27, 2018: Your Thursday News Briefing: Child Poverty, Brett Kavanaugh, United Nations
Sept. 26, 2018: Our Investigative Reporters Explain the Trump-Russia Story 
Sept. 26, 2018: DealBook Briefing: Trump Rails Against Globalism
Sept. 26, 2018: Brett Kavanaugh, Bill Cosby, Dunkin’ Donuts: Your Wednesday Briefing
Sept. 26, 2018: The Crisis of Election Security
Sept. 25, 2018: Is a New Russian Meddling Tactic Hiding in Plain Sight?
Sept. 24, 2018: When Reporting on Defcon, Avoid Stereotypes and A.T.M.s
Sept. 22, 2018: For Hackers, Anonymity Was Once Critical. That’s Changing.
Sept. 22, 2018: Billionaire Backer of Maria Butina Had Russian Security Ties
Sept. 21, 2018: Tran Dai Quang, Hard-Line Vietnamese President, Dies at 61
Sept. 21, 2018: DealBook Briefing: Does Bank of America Care About Investment Banking?
Sept. 20, 2018: The Plot to Subvert an Election: Unraveling the Russia Story So Far
Sept. 20, 2018: The Plot to Subvert an Election: Unraveling the Russia Story So Far
Sept. 19, 2018: Inside Facebook’s Election ‘War Room’
Sept. 17, 2018: Can Ethiopia’s New Leader, a Political Insider, Change It From the Inside Out?
Sept. 10, 2018: Role Models Tell Girls That STEM’s for Them in New Campaign
Sept. 7, 2018: A Security Expert Tied to WikiLeaks Vanishes, and the Internet Is Abuzz
Sept. 5, 2018: AnchorFree, Maker of a Top Online Privacy App, Raises $295 Million
Sept. 5, 2018: ‘Five Eyes’ Nations Quietly Demand Government Access to Encrypted Data
Sept. 4, 2018: Australia Wants to Take Government Surveillance to the Next Level
Aug. 31, 2018: Once Bipartisan, an Election Security Bill Collapses in Rancor
Aug. 29, 2018: The Fourth Season of ‘Mr. Robot’ Will Be Its Last
Aug. 28, 2018: In Melbourne tech firms take the first crack at tomorrow
Aug. 28, 2018: Corrections: August 28, 2018
Aug. 26, 2018: This Week’s Wedding Announcements
Aug. 26, 2018: Evita Almassi, Christopher Main
Aug. 25, 2018: For a Working-Mom Reporter, ‘The Juggle’ Is Real
Aug. 24, 2018: The Week in Tech: Democracy Under Siege
Aug. 24, 2018: California Today: A Rare Look Inside Steve Jobs’s Family
Aug. 23, 2018: Jeff Sessions, Hawaii, Reality Winner: Your Thursday Evening Briefing
Aug. 23, 2018: Malcolm Turnbull, Trade War, Amazon Tribe: Your Friday Briefing
Aug. 23, 2018: Google Deletes 39 YouTube Channels Linked to Iranian Influence Operation
Aug. 23, 2018: Attempted Hacking of Voter Database Was a False Alarm, Democratic Party Says
Aug. 23, 2018: Paul Manafort, Hawaii, Urban Meyer: Your Thursday Briefing
Aug. 23, 2018: How FireEye Helped Facebook Spot a Disinformation Campaign
Aug. 22, 2018: Democratic Party Says It Thwarted Attempted Hack of Voter Database
Aug. 22, 2018: Donald Trump, Duncan Hunter, Hawaii: Your Wednesday Briefing
Aug. 22, 2018: Facebook Identifies New Influence Operations Spanning Globe
Aug. 21, 2018: New Russian Hacking Targeted Republican Groups, Microsoft Says
Aug. 17, 2018: The Week in Tech: When to Tweet
Aug. 15, 2018: Hold the Phone! My Unsettling Discoveries About How Our Gestures Online Are Tracked
Aug. 14, 2018: Uber Picks N.S.A. Veteran to Fix Troubled Security Team
Aug. 13, 2018: Tesla Board Surprised by Elon Musk’s Tweet on Taking Carmaker Private
Aug. 11, 2018: Brian Kemp, Enemy of Democracy 
...
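
A variation on the click loop, offered only as a rough sketch: the loop in the answer catches every exception and has no upper bound on clicks. The helper below separates the loop from Selenium so that it stops only on the exceptions you name and never exceeds a click cap. The names click_until_gone, find_button, stop_on and max_clicks are hypothetical and do not come from Selenium.

```python
def click_until_gone(find_button, stop_on=(Exception,), max_clicks=500):
    """Click the element returned by find_button() until it stops appearing.

    find_button: hypothetical callable that returns a clickable element, or
                 raises one of the exceptions in stop_on when none is left.
    stop_on:     exception types that mean "no more buttons".
    max_clicks:  safety cap so the loop cannot run forever.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        try:
            button = find_button()
        except stop_on:
            break  # no button appeared in time: assume everything is loaded
        button.click()
        clicks += 1
    return clicks
```

With the objects from the answer, this could be called as click_until_gone(lambda: wait.until(EC.element_to_be_clickable((By.XPATH, '//button[@type="button"][contains(.,"Show More")]'))), stop_on=(TimeoutException,)), where TimeoutException is imported from selenium.common.exceptions. Any other error then surfaces instead of silently ending the loop.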

6 Comments

I have given +1 as it is only half working. The Selenium part works, as it clicks every "Show More" button. However, there is still an issue in the BeautifulSoup part: it does not scrape all the titles. It only scrapes and prints up to the first "Show More" button. What is the problem?
Hmm, not sure. I'll look at it now.
Thanks, chitown88. Is it possible to get the full article as well, along with the date and title? I hesitate to ask since you have already been a great help, but I tried and couldn't do it. I have just started with Python (I am not a programmer). It would be great if you could help me. If not, that is all right too. Thanks again.
I'd imagine it is possible. I'll give it a look in a moment and get back to you.
@PiyushGhasiya Ask as a new question. I have the solution for you. Let me know when you put the question up
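
Regarding the first comment above (titles scraped only up to the first "Show More"): the cause is not established in this thread. One possibility, stated here purely as an assumption, is that the page source is read before the newly loaded results have been added to the page. A minimal sketch of a check for that is to poll the number of results until it grows. The names wait_for_growth, count_items, previous, timeout and poll are hypothetical.

```python
import time


def wait_for_growth(count_items, previous, timeout=10.0, poll=0.25):
    """Poll count_items() until it exceeds previous or timeout seconds pass.

    count_items: hypothetical callable returning the current number of results.
    Returns the latest count; the caller compares it with previous to see
    whether anything new was loaded.
    """
    deadline = time.monotonic() + timeout
    current = count_items()
    while current <= previous and time.monotonic() < deadline:
        time.sleep(poll)
        current = count_items()
    return current
```

With Selenium, count_items might be something like lambda: len(browser.find_elements(By.CSS_SELECTOR, 'ol[data-testid="search-results"] li')). That selector is a guess based on the data-testid used in the answer, not a verified one. Calling the helper after each click, and once more before building the soup, would show whether the result list is still growing.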
