
I wrote code in Python using the 'requests' and 'BeautifulSoup' libraries to scrape text data from the first 100 sites returned by Google. It works well on most sites, but it raises errors on the ones that respond slowly or not at all. I am getting this error:

raise MaxRetryError(_pool, url, error or ResponseError(cause)) requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='www.lfpress.com', port=80): Max retries exceeded with url: /2015/11/06/fair-with-a-flare-samosas-made-easy (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11001] getaddrinfo failed',))

Am I supposed to change the code inside the requests API? Or do I need to use a proxy? How can I skip that site and move on to the next one? The error is stopping my execution.

    try:.. except: pass ? Commented Jan 2, 2016 at 22:13

1 Answer


Wrap the call in a "try except" block to catch that exception, and continue if you don't care about the error, like:

import requests

try:
    requests.get('http://stackoverflow.com/')
except requests.packages.urllib3.exceptions.MaxRetryError as e:
    print(repr(e))  # log the error and keep going
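For the scraping loop described in the question, you can catch the failure per URL and continue to the next site. A sketch (the URL list is hypothetical); note that requests wraps urllib3's MaxRetryError in its own ConnectionError, and all requests errors share the base class requests.exceptions.RequestException, so catching that covers DNS failures, timeouts, and connection errors alike:

```python
import requests

def fetch_text(url, timeout=10):
    """Return the page body, or None if the request fails for any reason."""
    try:
        resp = requests.get(url, timeout=timeout)  # timeout avoids hanging on slow sites
        resp.raise_for_status()  # turn HTTP 4xx/5xx into exceptions too
        return resp.text
    except requests.exceptions.RequestException as e:
        # RequestException is the base class for all requests errors,
        # including the ConnectionError that wraps MaxRetryError
        print("Skipping {}: {!r}".format(url, e))
        return None

# hypothetical list standing in for the 100 Google result URLs
for url in ["http://no-such-host.invalid/"]:
    text = fetch_text(url)
    if text is None:
        continue  # move on to the next site
    # ... feed text to BeautifulSoup here ...
```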

3 Comments

Well, thanks. How can I avoid all exceptions present in requests.packages.urllib3.exceptions, not just MaxRetryError?
@MuhammadZeeshan That's called passive error handling. Use a bare except without specifying an exception type.
To expand on that: you can write except Exception as e: and then handle e in the block.
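A narrower alternative to the bare except suggested above: requests groups every error it raises under one base class, so a single except requests.exceptions.RequestException clause catches the whole family without also swallowing unrelated exceptions like KeyboardInterrupt. A quick check of the hierarchy:

```python
import requests

# Connection failures (including the wrapped MaxRetryError) and timeouts
# are both subclasses of the common RequestException base class.
print(issubclass(requests.exceptions.ConnectionError,
                 requests.exceptions.RequestException))  # True
print(issubclass(requests.exceptions.Timeout,
                 requests.exceptions.RequestException))  # True
```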
