i am trying to scrape author name and author url from the following webpage.
https://medium.com/javascript-scene/top-javascript-frameworks-and-topics-to-learn-in-2019-b4142f38df20?source=tag_archive
and i am using following code;
author_flag = 0
divs = soup.find_all('h2')
for div in divs:
author = div.find('a')
if(author is not None):
author_art.append(author.text)
author_url.append('https://medium.com'+ author.get('href'))
aurhor_flag = 1
break
if(author_flag==0):
author_art.append('Author information missing')
author_url.append('Author Url information missing')
can anyone take a look what i am doing wrong in this? As this code is not picking anything. its is just returning blank list.
Full code:
import pandas as pd
import requests
from bs4 import BeautifulSoup
import re
data = pd.read_csv('url_technology.csv')
author_art = []
author_url = []
for i in range(1):
try:
author_flag = 0
divs = soup.find_all('meta')
for div in divs:
author = div.find('span')
if(author is not None):
author_art.append(author.text)
author_url.append('https://medium.com'+author.get('href'))
aurhor_flag = 1
break
if(author_flag==0):
author_art.append('Author information missing')
author_url.append('Author Url information missing')
except:
print('no data found')
author_art = pd.DataFrame(title)
author_url = pd.DataFrame(url)
res = pd.concat([author_art, author_art] , axis=1)
res.columns = ['Author_Art', 'Author_url']
res.to_csv('combined1.csv')
print('File created successfully')
https://medium.com/javascript-scene/top-javascript-frameworks-and-topics-to-learn-in-2019-b4142f38df20?source=tag_archive---------0----------------------- https://medium.com/job-advice-for-software-engineers/what-i-want-and-dont-want-to-see-on-your-software-engineering-resume-cbc07913f7f6?source=tag_archive---------1----------------------- https://itnext.io/load-testing-using-apache-jmeter-af189dd6f805?source=tag_archive---------2----------------------- https://medium.com/s/story/black-mirror-bandersnatch-a-study-guide-c46dfe9156d?source=tag_archive---------3----------------------- https://medium.com/fast-company/the-worst-design-crimes-of-2018-56f32b027bb7?source=tag_archive---------4----------------------- https://towardsdatascience.com/make-your-pictures-beautiful-with-a-touch-of-machine-learning-magic-31672daa3032?source=tag_archive---------5----------------------- https://medium.com/hackernoon/the-state-of-ruby-2019-is-it-dying-509160a4fb92?source=tag_archive---------6-----------------------