I am web-scraping a lot of PDFs of committee meetings from a local government website (https://www.gmcameetings.co.uk/), so there are links within links within links. I can successfully scrape all the 'a' tags from the main area of the page (the ones that I want), but when I try to scrape anything within them I get the error in the title of the question: AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()? How do I fix this?
I am completely new to coding and started an internship yesterday for which I am expected to web-scrape this information. The woman I'm supposed to be working with is not here for another couple of days and nobody else can help me, so please bear with me and be kind as I am a complete beginner and doing this alone. I know I have set up the first part of the code correctly, as I can download the whole page or download any particular link. Again, it's when I try to scrape within the links I have already (successfully) scraped that I get the above error message. I think (with the little knowledge I have) that it's because of the 'output' of 'all_links', which comes out as below. I have tried both find() and findAll(), which both result in the same error message.
#the error message
date_links_area = all_links.find('ul',{"class":"item-list item-list--rich"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\rache\AppData\Local\Programs\Python\Python37-32\lib\site-packages\bs4\element.py", line 1620, in __getattr__
    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
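The traceback suggests that all_links is a ResultSet, the list-like object that find_all() returns. The list itself has no find() method, but each Tag inside it does. Below is a minimal sketch of that difference, assuming bs4 is installed. The HTML snippet and its URLs are made up for illustration and are not taken from the real site; only the name all_links comes from the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one of the real pages.
html = """
<div class="main">
  <a href="/meetings/1">Meeting papers</a>
  <a href="/meetings/2">Constitution</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet, which behaves like a list of Tag objects.
all_links = soup.find_all("a")

# Calling find() on the list itself raises AttributeError,
# which is the error shown in the traceback.
try:
    all_links.find("ul")
    raised = False
except AttributeError:
    raised = True

# Looping over the ResultSet gives one Tag at a time, and each Tag
# supports get(), find(), find_all() and so on.
hrefs = [link.get("href") for link in all_links]

print(raised)
print(hrefs)
```

The same pattern would apply to any later step: loop over the result of find_all() and call find() on each item, rather than on the whole result.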
#output of all_links looks like this (this is only part of it)
href="https://www.gmcameetings.co.uk/info/20180/live_meetings/199/membership_201819">Members of the GMCA 2018/19, Greater Manchester Combined Authority Constitution, Meeting papers,
Some of those links go to a page that has a list of dates, which is the area of the page I'm trying to get to. Within that area I need to get the links with the dates, and then from each of those dated pages I need to grab the PDFs I want. Apologies if this doesn't make sense. I'm trying my best to do this on my own with zero experience.
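The steps described above (committee page, then the list of dates, then the PDFs on each dated page) could be sketched roughly as follows. This is only an illustration under several assumptions: links_in, collect_pdfs and the fetch argument are hypothetical names, the example.org pages are invented so the sketch runs without network access, and the real site's markup may differ. The class name item-list--rich is taken from the find() call in the traceback.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup


def links_in(html, base_url, list_class=None):
    """Return absolute hrefs from a page, optionally only those
    inside the first <ul> carrying the class list_class."""
    soup = BeautifulSoup(html, "html.parser")
    area = soup.find("ul", class_=list_class) if list_class else soup
    if area is None:  # this page has no such list
        return []
    return [urljoin(base_url, a["href"]) for a in area.find_all("a", href=True)]


def collect_pdfs(start_url, fetch):
    """Follow committee page -> dated meeting pages -> PDF links.
    fetch is any callable that takes a URL and returns that page's HTML."""
    pdfs = []
    for date_url in links_in(fetch(start_url), start_url, "item-list--rich"):
        for href in links_in(fetch(date_url), date_url):
            if href.lower().endswith(".pdf"):
                pdfs.append(href)
    return pdfs


# Hypothetical pages standing in for the real site.
pages = {
    "https://example.org/committee": (
        '<ul class="item-list item-list--rich">'
        '<li><a href="/m/1">29 June 2018</a></li></ul>'
        '<a href="/other">Other</a>'
    ),
    "https://example.org/m/1": (
        '<a href="/docs/agenda.pdf">Agenda</a><a href="/about">About</a>'
    ),
}

pdfs = collect_pdfs("https://example.org/committee", pages.__getitem__)
print(pdfs)
```

Against the live site, fetch could be something like a function that downloads the page with the requests library and returns its text; that part is left out here because it depends on how the pages are already being downloaded.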