I need help removing duplicate URLs from my output. If possible, I'd rather not collect everything into a list first; I feel like this can be achieved with some logical statement, I'm just not sure how to make it happen. I'm using Python 3.6.
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
from urllib.parse import urljoin as join
my_url = 'https://www.census.gov/programs-surveys/popest.html'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
filename = "LinkScraping.csv"
f = open(filename, "w")
headers = "Web_Links\n"
f.write(headers)
links = page_soup.find_all('a')
for link in links:
    web_links = link.get("href")
    ab_url = join(my_url, web_links)
    print(ab_url)
    if ab_url:
        f.write(str(ab_url) + "\n")

f.close()
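
To show roughly what I mean by a logical statement: I was imagining something like a membership check against a set of already-written URLs inside the loop, along these lines (the "seen" name is mine, and I haven't tested this):

seen = set()  # hypothetical: absolute URLs already written to the file

for link in links:
    web_links = link.get("href")
    ab_url = join(my_url, web_links)
    print(ab_url)
    # only write each absolute URL once
    if ab_url and ab_url not in seen:
        seen.add(ab_url)
        f.write(str(ab_url) + "\n")

f.close()

Is a condition like "ab_url not in seen" the right kind of approach here, or is there a cleaner way that avoids keeping a collection around entirely?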