I am trying to implement a program that determines whether a page supports TLS and whether it needs the www prefix. So I am testing page1.cz and checking the response status of these requests:

Session().get('http://page1.cz')
<Response [200]>
Session().get('http://www.page1.cz')
<Response [200]>
Session().get('https://page1.cz')
<Response [200]>
Session().get('https://www.page1.cz')
<Response [200]>

This works fine. I know that page1.cz uses https and always redirects to https://page1.cz. When I tried page2.cz, I received an error when testing with the https prefix:

Session().get('http://page2.cz')
<Response [200]>
Session().get('http://www.page2.cz')
<Response [200]>
Session().get('https://page2.cz')
ConnectionError: HTTPSConnectionPool(host='page2.cz', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f75e85f03c8>: Failed to establish a new connection: [Errno 111] Connection refused',))
Session().get('https://www.page2.cz')
ConnectionError: HTTPSConnectionPool(host='www.page2.cz', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f75e85f03c8>: Failed to establish a new connection: [Errno 111] Connection refused',))

I know that the second page does not support https, but why this error? Shouldn't it just return a 4xx status code, or am I wrong? What am I doing wrong, and how can I check whether a page supports http, https, and the www prefix?

  • You made too many requests and got banned from accessing the website. It is a policy matter and has nothing to do with HTTPS or w/e. Commented Aug 23, 2018 at 22:10
  • @DYZ It's not a matter of being banned, it's the exception raised when a connection is refused Commented Aug 23, 2018 at 22:13
  • But it is refused because the OP got banned. Commented Aug 23, 2018 at 22:14
  • I tried it once and received this error. Now, after one hour, I am receiving the same error. @newbie, do I understand correctly that the connection is refused because it does not support https? Commented Aug 23, 2018 at 22:19

1 Answer

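The error says that the host refused the connection: nothing accepted the TCP connection on port 443, so no HTTP exchange took place and there is no status code to return. requests raises a ConnectionError instead.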
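To see that the refusal happens below HTTP, here is a minimal sketch using only the standard library. The helper name port_accepts_connections is made up for this illustration, and the 3 second timeout is an arbitrary choice. It only tells you whether something is listening on the port, not whether a valid TLS site is served there.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration: it checks the transport layer only
    and says nothing about TLS certificates or HTTP status codes.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (errno 111 on Linux), timeouts,
        # and name-resolution failures.
        return False
```

Calling port_accepts_connections("page2.cz", 443) for a host like the one in the question would be expected to return False, which is the same condition requests reports as a ConnectionError.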

You can handle the exception using a try-except block.

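import requests

your_website = "https://page2.cz"  # URL to test

try:
    req = requests.get(your_website)
    print(req.status_code)
except requests.exceptions.ConnectionError:
    # Raised when no connection could be established, e.g. the port is closed
    print("Connection refused")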

Additionally you can set a timeout for the request, e.g.,

req = requests.get(your_website, timeout=1)

Consider for instance the following website http://www.qq.com/ that does not support https.

With your_website being http://www.qq.com/ you would receive a 200 OK, while with your_website being https://www.qq.com/ an exception is raised.
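Putting this together for the four variants from the question, a rough sketch could look like the following. The function name check_variants, the get parameter (there only so a stub can be swapped in for testing), and the 5 second timeout are my own assumptions, not part of requests. It relies on the fact that requests.exceptions.RequestException derives from IOError, so catching OSError covers refused connections, timeouts, and TLS errors.

```python
def check_variants(domain, get=None, timeout=5):
    """Map each scheme/prefix variant of `domain` to (status_code, final_url),
    or to None when the request could not be completed.

    Hypothetical helper for illustration only.
    """
    if get is None:
        import requests  # imported lazily so a stub `get` can be injected
        get = requests.get

    results = {}
    for scheme in ("http", "https"):
        for prefix in ("", "www."):
            url = f"{scheme}://{prefix}{domain}"
            try:
                response = get(url, timeout=timeout)
            except OSError:
                # requests' exceptions subclass IOError (OSError), so a refused
                # connection, a timeout, or a TLS failure all end up here.
                results[url] = None
            else:
                # response.url is the final URL after redirects were followed,
                # which shows whether http was redirected to https.
                results[url] = (response.status_code, response.url)
    return results
```

With this, check_variants("page1.cz") should show every variant ending at an https URL if the site redirects as described, while for a site without https the two https entries would be None.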


3 Comments

So that means everything works as it should? I expected that it would return a status code other than 2xx (e.g. 404). I know I could handle the exception, but it looks like a very impractical solution. Is there a better way to check whether a website supports https?
So is it normal behaviour to raise an exception when https is not supported?
Note that you could also receive it as a response when HTTP is not supported and a redirect has not been implemented in the server.
