I am new to Python and am in the process of scraping a site to collect inventory information. The inventory items are spread across 6 pages on the site. The scraping went very smoothly and I was able to parse out all of the HTML elements I wanted to select.
I am now taking this to the next step and trying to export the results to a CSV file using the csv.writer included in Python 3. The script runs from my command line without any syntax errors, but the CSV file never gets created. I am wondering if there is an obvious issue with my script, or something I left out when writing the parsed HTML elements to the CSV.
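For reference, this is the bare-bones csv.DictWriter pattern I am trying to follow (a toy example with made-up values, not my real data):

import csv

with open('example.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['make', 'price'])
    writer.writeheader()                                    # header row
    writer.writerow({'make': 'Honda', 'price': '$8,995'})   # one data row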
Here is my code:
import requests
import csv
from bs4 import BeautifulSoup
main_used_page = 'https://www.necarconnection.com/used-vehicles/'
page = requests.get(main_used_page)
soup = BeautifulSoup(page.text,'html.parser')
def get_items(main_used_page, urls):
    main_site = 'https://www.necarconnection.com/'
    counter = 0
    for x in urls:
        site = requests.get(main_used_page + urls[counter])
        soup = BeautifulSoup(site.content, 'html.parser')
        counter += 1
        for item in soup.find_all('li'):
            vehicle = item.find('div', class_='inventory-post')
            image = item.find('div', class_='vehicle-image')
            price = item.find('div', class_='price-top')
            vin = item.find_all('div', class_='vinstock')
            try:
                url = image.find('a')
                link = url.get('href')
                pic_link = url.img
                img_url = pic_link['src']
                if 'gif' in pic_link['src']:
                    # lazy-loaded images keep the real URL in data-src
                    img_url = pic_link['data-src']
                landing = requests.get(main_site + link)
                souped = BeautifulSoup(landing.content, 'html.parser')
                comment = ''
                for comments in souped.find_all('td', class_='results listview'):
                    comment += comments.get_text()
                with open('necc-december.csv', 'w', newline='') as csv_file:
                    fieldnames = ['CLASSIFICATION', 'TYPE', 'PRICE', 'VIN',
                                  'INDEX', 'LINK', 'IMG', 'DESCRIPTION']
                    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
                    writer.writeheader()
                    writer.writerow({
                        'CLASSIFICATION': vehicle['data-make'],
                        'TYPE': vehicle['data-type'],
                        'PRICE': price,
                        'VIN': vin,
                        'INDEX': vehicle['data-location'],
                        'LINK': link,
                        'IMG': img_url,
                        'DESCRIPTION': comment})
            # skip list items that are missing any of these pieces
            except (TypeError, AttributeError, UnboundLocalError):
                pass

# collect the relative URLs of every paginated inventory page
urls = ['']
counter = 0
prev = 0
for x in range(100):
    site = requests.get(main_used_page + urls[counter])
    soup = BeautifulSoup(site.content, 'html.parser')
    for button in soup.find_all('a', class_='pages'):
        if button['class'] == ['prev']:
            prev += 1
        if button['class'] == ['next']:
            next_url = button.get('href')
            if next_url not in urls:
                urls.append(next_url)
                counter += 1
    if prev - 1 > counter:
        break

get_items(main_used_page, urls)
Here is a screenshot of what happens when the script is run from the command line:

It takes a while for the script to run, so I know the pages are being fetched and processed. I am just unsure what is going wrong between that and the CSV file actually being created.
I hope this is enough detail. Again, any tips or tricks on working with the Python 3 csv.writer would be much appreciated, as I have tried several variations without success.
Edit: one suggestion I received was to open the file with open('necc-december.csv','a',newline=''), collect all of the data into a dictionary first, and only write to the CSV once, after all of the data has been gathered.
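If I understand that suggestion, a minimal sketch would look something like this (the rows list and the placeholder comment are mine, not tested against the site):

import csv

rows = []   # while scraping, append one dict per vehicle instead of writing immediately
# e.g. inside the item loop: rows.append({'CLASSIFICATION': ..., 'TYPE': ..., ...})

fieldnames = ['CLASSIFICATION', 'TYPE', 'PRICE', 'VIN',
              'INDEX', 'LINK', 'IMG', 'DESCRIPTION']

# open the file once, after all pages have been scraped;
# 'w' is fine here because the file is only opened this one time
with open('necc-december.csv', 'w', newline='') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)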