I'm trying to scrape event details (event name, date, time, and tags) from the Central Park events calendar at https://www.centralparknyc.org/calendar. The site loads its content dynamically, and the event details do not seem to be present when I scrape the page.
I've attempted to use requests and BeautifulSoup to scrape the content, but I'm encountering a 403 Forbidden error, which I suspect is due to the website's bot protection measures.
Could someone guide me on how to properly scrape this dynamic content using Python? Any advice on handling dynamic content and bot detection would be greatly appreciated.

import requests
from bs4 import BeautifulSoup

url = 'https://www.centralparknyc.org/calendar'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}

response = requests.get(url, headers=headers)
if response.status_code == 200:
    # Process the page
    soup = BeautifulSoup(response.content, 'html.parser')
    # ... scraping logic here ...
else:
    print(f'Failed to retrieve the webpage: {response.status_code}')
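
To separate the two problems described above (the request being refused versus the events being filled in by JavaScript after the page loads), a rough check like the following might help. This is only a sketch under stated assumptions: `classify_response` and the `marker` argument are hypothetical names, the status codes treated as "blocked" are a guess at how bot protection usually responds, and the sample strings are made up rather than taken from the real site.

```python
def classify_response(status_code: int, body: str, marker: str) -> str:
    """Roughly classify the result of fetching a page.

    marker is some text visible in the browser-rendered page (hypothetical,
    for example an event name) that should appear in the HTML if the
    server sent the event data directly.
    """
    # Assumption: these codes usually mean the request itself was refused.
    if status_code in (401, 403, 429):
        return "blocked"
    if status_code != 200:
        return "error"
    # The page came back, but the expected text is missing, which suggests
    # the content is added client-side after the initial HTML loads.
    if marker.lower() not in body.lower():
        return "rendered-client-side"
    return "ok"


# Made-up sample inputs, not real responses from the site.
print(classify_response(403, "", "Bird Walk"))
print(classify_response(200, "<div id='app'></div>", "Bird Walk"))
print(classify_response(200, "<h3>Bird Walk</h3>", "Bird Walk"))
```

With the code above, `response.status_code` and `response.text` would be the first two arguments. A "blocked" result points at request headers or bot detection, while "rendered-client-side" points at needing a real browser or the underlying data endpoint.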
