I get an error when I use a for loop in my web scraping code.
Here is my code for the app.py file:
page_content = requests.get("http://books.toscrape.com/").content
parser = BookParser(page_content)
containers = parser.Content()
results = []
for container in containers:
    name = container.getName()
    link = container.getLink()
    price = container.getPrice()
    rating = container.getRating()
    results.append({'name': name,
                    'link': link,
                    'price': price,
                    'rating': rating
                    })
print(results[4])
And this is the code for the BookParser class that is called:
class BookParser(object):
    RATINGS = {
        'One': 1,
        'Two': 2,
        'Three': 3,
        'Four': 4,
        'Five': 5
    }

    def __init__(self, page):
        self.soup = BeautifulSoup(page, 'html.parser')

    def Content(self):
        return self.soup.find_all("li", attrs={"class": 'col-xs-6'})

    def getName(self):
        return self.soup.find('h3').find('a')['title']

    def getLink(self):
        return self.soup.find('h3').find('a')['href']

    def getPrice(self):
        locator = BookLocator.PRICE
        price = self.soup.select_one(locator).string
        pattern = r"[0-9\.]*"
        validator = re.findall(pattern, price)
        return float(validator[1])

    def getRating(self):
        locator = BookLocator.STAR_RATING
        rating = self.soup.select_one(locator).attrs['class']
        rating_number = BookParser.RATINGS.get(rating[1])
        return rating_number
and finally, this is the error:
Traceback (most recent call last):
  File "c:\Users\Utkarsh Kumar\Documents\Projects\milestoneP4\app.py", line 13, in <module>
    name = container.getName()
TypeError: 'NoneType' object is not callable
I don't understand why calling getName() inside the loop fails with a NoneType error.
Any help will be highly appreciated, as I am pretty new to web scraping.
PS: Using it without the for loop works fine,
something like this:
name = parser.getName()
print(name)
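parser.getName() will return "A Light in the Attic" every time, because it searches the whole page (self.soup) and finds the first h3 rather than looking inside a single container. That is probably not what you want. In fairness, it's a pretty cool book though.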
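As for the error itself: find_all() returns plain BeautifulSoup Tag objects, not BookParser instances. A Tag treats an unknown attribute such as container.getName as a search for a child tag with that name, which finds nothing and gives None, and calling None raises the TypeError. One possible way around it, as a rough sketch, is to wrap each container in its own small parser. The BookItem class, the inline HTML, and the two CSS selectors below are assumptions for illustration (the question does not show BookLocator), so adjust them to your actual code:

```python
import re
from bs4 import BeautifulSoup

# Made-up, trimmed-down HTML in the shape of two books.toscrape.com entries.
HTML = """
<ol class="row">
  <li class="col-xs-6 col-sm-4">
    <article class="product_pod">
      <p class="star-rating Three"></p>
      <h3><a href="catalogue/a-light-in-the-attic_1000/index.html"
             title="A Light in the Attic">A Light in the ...</a></h3>
      <p class="price_color">£51.77</p>
    </article>
  </li>
  <li class="col-xs-6 col-sm-4">
    <article class="product_pod">
      <p class="star-rating One"></p>
      <h3><a href="catalogue/tipping-the-velvet_999/index.html"
             title="Tipping the Velvet">Tipping the Velvet</a></h3>
      <p class="price_color">£53.74</p>
    </article>
  </li>
</ol>
"""

RATINGS = {'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}

# Assumed selectors; BookLocator.PRICE / STAR_RATING are not shown in the question.
PRICE_SELECTOR = 'p.price_color'
RATING_SELECTOR = 'p.star-rating'


class BookItem:
    """Hypothetical wrapper around ONE <li> container (a bs4 Tag)."""

    def __init__(self, tag):
        self.tag = tag

    def get_name(self):
        # Search inside this container only, not the whole page.
        return self.tag.find('h3').find('a')['title']

    def get_link(self):
        return self.tag.find('h3').find('a')['href']

    def get_price(self):
        text = self.tag.select_one(PRICE_SELECTOR).get_text()
        return float(re.search(r"\d+\.\d+", text).group())

    def get_rating(self):
        classes = self.tag.select_one(RATING_SELECTOR)['class']
        # Pick whichever class name is a known rating word.
        word = next(c for c in classes if c in RATINGS)
        return RATINGS[word]


soup = BeautifulSoup(HTML, 'html.parser')
containers = soup.find_all('li', attrs={'class': 'col-xs-6'})

# This is what the original loop was calling: the attribute lookup gives None.
lookup = containers[0].getName

results = []
for container in containers:
    book = BookItem(container)
    results.append({'name': book.get_name(),
                    'link': book.get_link(),
                    'price': book.get_price(),
                    'rating': book.get_rating()})

print(lookup)
print(results)
```

With the real page you would pass requests.get("http://books.toscrape.com/").content to BeautifulSoup instead of the inline HTML; each dictionary in results then describes a different book rather than the first one repeated.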