I'm trying to scrape all the headlines from the New York Times homepage and count how many of them mention a certain word. In this particular case, I want to see how many headlines mention either the Coronavirus or Trump. This is my code, but it doesn't work: 'number' stays at the integer I assign to it before the while loop.
import requests
from bs4 import BeautifulSoup
url = 'https://www.nytimes.com'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
a = soup.findAll("h2", class_="esl82me0")
for story_heading in a:
    print(story_heading.contents[0])
lijst = ["trump", "Trump", "Corona", "COVID", "virus", "Virus", "Coronavirus", "COVID-19"]
number = 0
run = 0
while run < len(a)+1:
    run += 1
    if any(lijst in s for s in a):
        number += 1
print("\nTrump or the Corona virus have been mentioned", number, "times.")
So I basically want the variable 'number' to increase by 1 for each headline (each entry in the list a) that contains the word Trump, the word Coronavirus, or both.
Does anyone know how to do this?
collections.Counter would do the trick for you.
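The condition in the question, `lijst in s`, asks whether the whole list is contained in a headline, not whether any single keyword occurs in the headline's text, so it never matches. Below is a minimal sketch of one way to do the count with `collections.Counter`. The function name `count_mentions`, the lowercased keyword list, and the sample headlines are illustrative assumptions, not part of the original code; the sample strings stand in for the scraped headlines so the snippet runs without a network request.

```python
from collections import Counter

# Keywords from the question, lowercased so matching is case-insensitive.
KEYWORDS = ["trump", "corona", "covid", "virus"]


def count_mentions(headlines, keywords=KEYWORDS):
    """Return (number of headlines with at least one keyword, per-keyword Counter)."""
    per_keyword = Counter()
    matching = 0
    for headline in headlines:
        text = headline.lower()
        # Substring test per keyword, instead of testing the whole list at once.
        hits = [kw for kw in keywords if kw in text]
        per_keyword.update(hits)
        if hits:
            matching += 1
    return matching, per_keyword


# Hypothetical sample headlines standing in for the scraped ones.
sample = [
    "Trump Addresses the Nation on the Coronavirus",
    "Markets Fall as Virus Fears Grow",
    "A Recipe for Weeknight Pasta",
]

number, per_keyword = count_mentions(sample)
print("\nTrump or the Corona virus have been mentioned", number, "times.")
print(per_keyword)
```

With the scraped results from the question, the headline strings could be obtained with something like `[h.get_text() for h in a]` and passed to `count_mentions`. Note that a headline mentioning both Trump and the virus is counted once in `number`, while the `Counter` records each keyword separately.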