
I'm new to Python and I'm working on a web scraping project. I was following a tutorial and got stuck at the part where I pass the data to a CSV sheet. I already tried moving some brackets and other structures, but nothing seems to help. See the code attached. Thanks for the help, I've been stuck for a couple of hours.

One observation: the name "Dataframe" did not change color. I don't know if this makes any difference, but it's worth mentioning.

import bs4
from bs4 import BeautifulSoup
import pandas
import selenium
from selenium import webdriver
import pandas as pd


products=[] #List to store name of the product
prices=[] #List to store price of the product
ratings=[] #List to store rating of the product
driver = webdriver.Chrome(executable_path = r'C:\Users\directory\Desktop\chromedriver.exe')
driver.get("https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniq")
content = driver.page_source
soup = BeautifulSoup(content, 'html.parser')
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34 _2beYZw'})
products.append(name.text)
prices.append(price.text)
ratings.append(rating(("dd").text)

df = pd.Dataframe(data= {'Product Name': products,'Price': prices,'Rating':ratings})
df.to_csv('products.csv', index=False, encoding='utf-8')

The error:

df = pd.Dataframe(data= {'Product Name': products,'Price': prices,'Rating':ratings})
 ^
SyntaxError: invalid syntax


2
  • Where is pd defined? You're probably missing import pandas as pd. Commented May 24, 2020 at 13:22
  • It's not pd.Dataframe, it is pd.DataFrame. Commented May 24, 2020 at 13:36
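
A note on the traceback itself: a misspelled attribute such as pd.Dataframe only fails when the line runs, so it cannot be what produces the SyntaxError shown in the question. The ratings.append(...) line just before df = ... opens more parentheses than it closes, and the parser can report that on the following line. A minimal sketch of the effect, using the built-in compile() on made-up snippets (the snippet contents and the helper name syntax_error_line are illustrative, not taken from the question):

```python
def syntax_error_line(source):
    """Return (line number, message) of the first SyntaxError, or None if the source compiles."""
    try:
        # compile() only parses the code; nothing is executed
        compile(source, "<snippet>", "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# Line 2 opens three parentheses but closes only two
broken = (
    "ratings = []\n"
    "ratings.append(len(('dd'))\n"
    "total = 1\n"
)
# Same snippet with the missing closing parenthesis added
fixed = broken.replace("len(('dd'))", "len(('dd')))")

print(syntax_error_line(broken))  # a (line, message) tuple
print(syntax_error_line(fixed))   # None, because the snippet now parses
```

On Python 3.9 and earlier the reported line is typically the one after the unbalanced call, which matches the traceback in the question; newer versions usually point at the unclosed parenthesis directly.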

3 Answers

1

There are several flaws in your code, from redundant imports to, most importantly, the for loop. In your code the items are appended to the lists outside the for loop, so at most the last product found is stored. Second, once the data is in a dictionary, building the DataFrame directly from it fails if the lists end up with different lengths; wrapping each list in a pd.Series avoids that by padding the shorter columns. Try the following code:

from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver

products = []  # List to store name of the product
prices = []    # List to store price of the product
ratings = []   # List to store rating of the product
driver = webdriver.Chrome(executable_path = r'C:\Users\directory\Desktop\chromedriver.exe')
driver.get("https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniq")
content = driver.page_source
soup = BeautifulSoup(content, 'html.parser')
for a in soup.findAll('a', href=True, attrs={'class': '_31qSD5'}):
    name = a.find('div', attrs={'class': '_3wU53n'})
    price = a.find('div', attrs={'class': '_1vC4OE _2rQ-NK'})
    rating = a.find('div', attrs={'class': 'hGSR34'})
    # Append inside the loop so every product is stored, not just the last one
    products.append(name.text)
    prices.append(price.text)
    ratings.append(rating.text)

# Build the table once, after the loop has collected everything
data = {'Product Name': products,
        'Price': prices,
        'Rating': ratings}
# Wrapping each list in pd.Series lets pandas pad shorter columns with NaN
products_df = pd.DataFrame({k: pd.Series(v) for k, v in data.items()})
products_df.to_csv('products.csv', sep=",")

Results

,Product Name,Price,Rating
0,Apple MacBook Air Core i5 5th Gen - (8 GB/128 GB SSD/Mac OS Sierra) MQD32HN/A A1466,"₹65,990",4.7
1,Lenovo Ideapad Core i5 7th Gen - (8 GB/1 TB HDD/Windows 10 Home/2 GB Graphics) IP 320-15IKB Laptop,"₹51,990",4.3
2,HP 15 Core i3 6th Gen - (4 GB/1 TB HDD/Windows 10 Home) 15-be014TU Laptop,"₹36,163",4.1
3,Lenovo Core i5 7th Gen - (8 GB/2 TB HDD/Windows 10 Home/4 GB Graphics) IP 520 Laptop,"₹79,500",4.4
4,Lenovo Core i5 7th Gen - (8 GB/1 TB HDD/DOS/2 GB Graphics) IP 320-15IKB Laptop,"₹56,990",4.3


2 Comments

Thank you so much for the kind explanation. mnm solved it. Just one quick question: do you know where I can find the generated CSV file?
No problem. Change the last line of code to something like products_df.to_csv("../../data/jobs_df.csv", sep=','), where the first argument specifies the path to save the data file. Hope it's helpful.
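
To add to the comment above: when to_csv is given a bare filename such as 'products.csv', the file is written relative to the current working directory of the Python process, which is not necessarily the folder containing the script. A small sketch for checking where that is, using only the standard library (the helper name resolve_output_path is made up for illustration):

```python
from pathlib import Path


def resolve_output_path(filename):
    """Return the absolute path that a relative filename currently points to."""
    return Path(filename).resolve()


# Directory the Python process is currently running in
print(Path.cwd())
# Absolute location that 'products.csv' would be written to from here
print(resolve_output_path("products.csv"))
```

Passing an absolute path to to_csv instead removes the dependence on the working directory.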
0

Import the library this way. The first import pandas line should be deleted.

import pandas as pd

1 Comment

I already deleted "import pandas" and changed it to "DataFrame", and unfortunately nothing happens.
0

Python names are case-sensitive:

Use pd.DataFrame instead of pd.Dataframe
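
As a rough illustration of what a wrongly cased name does, here is a sketch that needs no pandas installation. It uses collections.OrderedDict from the standard library as a stand-in for pd.DataFrame (the helper name lookup_error is hypothetical). Note that the failure is an AttributeError raised when the line runs, not a SyntaxError.

```python
import collections


def lookup_error(module, name):
    """Return the exception type name raised by the attribute lookup, or None on success."""
    try:
        getattr(module, name)
    except AttributeError as err:
        return type(err).__name__
    return None


# Correct casing: the attribute exists
print(lookup_error(collections, "OrderedDict"))
# Wrong casing: the lookup fails at runtime
print(lookup_error(collections, "Ordereddict"))
```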
