I am accessing an HTTPS page through a proxy:
    def read_page(self, url):
        '''
        Gets web page using proxy and returns beautifulsoup object
        '''
        soup = None
        try:
            r = requests.get(url, proxies=PROXIES, auth=PROXY_AUTH,
                             cert = ('../static/crawlera-ca.crt'), verify=False,
                             allow_redirects=False)
        except requests.exceptions.MissingSchema:
            return False
        if r.status_code == 200:
            soup = bs4.BeautifulSoup(r.text, "html.parser")
        if soup:
            return soup
        return False
I am passing "https://www.bestbuy.com" as the url. I get this error:
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.bestbuy.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(336265225, '[SSL] PEM lib (_ssl.c:2964)'),))
When I remove the cert = ('../static/crawlera-ca.crt') argument, the request succeeds and I get an 'InsecureRequestWarning', which is expected since I am passing verify=False. But I don't understand why the SSLError occurs when the argument is present. The certificate file is in the right place in my folder hierarchy and was downloaded from the proxy service, so I believe it is the correct file.
The easy option would be to just not use the certificate and suppress the security warning, but I want to do it properly. Can anyone explain what is going on and how I can fix it?
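One possible explanation, offered as an assumption rather than a confirmed diagnosis: in requests, the cert argument is meant for a client-side certificate (a certificate together with its private key), while a CA certificate used to validate the server or proxy is normally passed through verify. A CA file that contains no private key would likely fail to load as a client certificate, which would match the "PEM lib" error. The sketch below shows the variant I would try, with the CA file passed as verify and no cert argument at all. The helper names build_request_kwargs and fetch, the proxy URL, and the credentials are hypothetical placeholders and not part of my original code.

```python
def build_request_kwargs(ca_bundle_path, proxies, proxy_auth):
    """Build keyword arguments for requests.get.

    The CA file goes in `verify` (used to validate the TLS peer), and
    `cert` is deliberately left out because it expects a client
    certificate with a private key, not a CA certificate.
    """
    return {
        "proxies": proxies,
        "auth": proxy_auth,
        "verify": ca_bundle_path,   # path to the CA certificate file
        "allow_redirects": False,
    }


def fetch(url, **kwargs):
    """Perform the GET request with the given keyword arguments."""
    # Third-party import kept local so the helper above has no dependencies.
    import requests
    return requests.get(url, **kwargs)


# Hypothetical placeholder values; substitute the real proxy settings.
PROXIES = {"https": "http://proxy.example.com:8010"}
PROXY_AUTH = ("API_KEY", "")

kwargs = build_request_kwargs("../static/crawlera-ca.crt", PROXIES, PROXY_AUTH)
# response = fetch("https://www.bestbuy.com", **kwargs)  # needs network and a real proxy
```

If this assumption is right, the request should go through without the SSLError and without the InsecureRequestWarning, because certificate verification stays enabled.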