I'm trying to use requests to download a list of URLs and catch the exception when a URL is bad. Here's my test code:
import requests
from requests.exceptions import ConnectionError

# good url
url = "http://www.google.com"
# bad url with good host
#url = "http://www.google.com/thereisnothing.jpg"
# url with bad host
#url = "http://somethingpotato.com"

print url
try:
    r = requests.get(url, allow_redirects=True)
    print "the url is good"
except ConnectionError, e:
    print e
    print "the url is bad"
If I pass in url = "http://www.google.com", everything works as expected, since it is a good URL:
http://www.google.com
the url is good
But if I pass in url = "http://www.google.com/thereisnothing.jpg", I still get:
http://www.google.com/thereisnothing.jpg
the url is good
So it's almost as if it's not even looking at anything after the "/".
Just to see whether the error checking works at all, I passed a bad hostname, url = "http://somethingpotato.com",
which kicked back the error message I expected:
http://somethingpotato.com
HTTPConnectionPool(host='somethingpotato.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1b6cd15b90>: Failed to establish a new connection: [Errno -2] Name or service not known',))
the url is bad
What am I missing to make requests catch a bad URL, not just a bad hostname?
Thanks
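
One likely explanation for the behavior above: requests.get raises ConnectionError only when no HTTP response comes back at all (for example, when the hostname cannot be resolved). A missing page on a reachable host normally comes back as an ordinary response with a 4xx status code, so no exception is raised unless the status is checked explicitly. Below is a minimal sketch of that check, assuming Python 3 syntax and a requests version that provides Response.raise_for_status(). The helper names response_is_good and url_is_good and the 10-second timeout are hypothetical choices for illustration, not part of the original code.

```python
import requests
from requests.exceptions import HTTPError, RequestException


def response_is_good(response):
    """Return True if the response status is below 400, False otherwise."""
    try:
        # raise_for_status() raises HTTPError for 4xx and 5xx status codes
        # and does nothing for other statuses.
        response.raise_for_status()
    except HTTPError:
        return False
    return True


def url_is_good(url, timeout=10):
    """Hypothetical helper: True only if the server answers without an error status."""
    try:
        response = requests.get(url, allow_redirects=True, timeout=timeout)
    except RequestException as e:
        # Covers bad hostnames, refused connections, timeouts, and similar failures.
        print(e)
        return False
    return response_is_good(response)
```

With this sketch, url_is_good("http://www.google.com/thereisnothing.jpg") should return False as long as the server answers that path with a 404. Checking r.status_code directly is an equivalent alternative if raising an exception is not wanted.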