It seems that urllib2 sends HTTP/1.1 requests by default?
- Is there any particular reason to use HTTP 1.0 over HTTP 1.1? – Waleed Khan, Dec 1, 2012 at 5:43
- I am also curious why HTTP 1.0 is needed. – Marwan Alsabbagh, Dec 1, 2012 at 6:19
- I am writing a test script for one of my stupid homework assignments, which only uses HTTP 1.0. (The test script is not part of the homework.) – houqp, Dec 1, 2012 at 18:45
2 Answers
To avoid monkey-patching httplib (a global change that affects every connection in the process), you could subclass HTTPConnection and define your own HTTP handler:
#!/usr/bin/env python
try:  # Python 2
    from httplib import HTTPConnection
    from urllib2 import HTTPHandler, build_opener
except ImportError:  # Python 3
    from http.client import HTTPConnection
    from urllib.request import HTTPHandler, build_opener


class HTTP10Connection(HTTPConnection):
    # Private httplib attributes that control the version in the request line
    _http_vsn = 10
    _http_vsn_str = "HTTP/1.0"


class HTTP10Handler(HTTPHandler):
    def http_open(self, req):
        return self.do_open(HTTP10Connection, req)


# Only requests made through this opener use HTTP/1.0
opener = build_opener(HTTP10Handler)
print(opener.open('http://stackoverflow.com/q/13656757').read()[:100])
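If you want to check the handler without an Apache log, one option is to point it at a throwaway local server that records the version from each request line. The following is only a sketch for Python 3: RecordingHandler, seen_versions and fetch_and_record are hypothetical names made up for this illustration, and the empty ProxyHandler is an added assumption so that proxy settings from the environment do not interfere.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import HTTPHandler, ProxyHandler, build_opener


class HTTP10Connection(HTTPConnection):
    _http_vsn = 10
    _http_vsn_str = "HTTP/1.0"


class HTTP10Handler(HTTPHandler):
    def http_open(self, req):
        return self.do_open(HTTP10Connection, req)


# Hypothetical helper state: version strings seen by the local test server.
seen_versions = []


class RecordingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # request_version holds the version from the request line, e.g. "HTTP/1.0"
        seen_versions.append(self.request_version)
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def fetch_and_record():
    """Fetch one page through the HTTP/1.0 opener and return (body, versions)."""
    seen_versions.clear()
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        # ProxyHandler({}) disables proxies so the request reaches the local server
        opener = build_opener(ProxyHandler({}), HTTP10Handler)
        body = opener.open(url).read()
    finally:
        server.shutdown()
        server.server_close()
    return body, list(seen_versions)
```

Calling fetch_and_record() should return the response body together with the list of versions the server saw, which lets a test script assert on "HTTP/1.0" directly.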
urllib2 uses httplib under the hood to make the connection. You can switch it to HTTP 1.0 as shown below. I've included my Apache server's access log to show how the request line changes from HTTP/1.1 to HTTP/1.0.
code
import urllib2, httplib
httplib.HTTPConnection._http_vsn = 10
httplib.HTTPConnection._http_vsn_str = 'HTTP/1.0'
print urllib2.urlopen('http://localhost/').read()
access.log
127.0.0.1 - - [01/Dec/2012:09:10:27 +0300] "GET / HTTP/1.1" 200 454 "-" "Python-urllib/2.7"
127.0.0.1 - - [01/Dec/2012:09:16:32 +0300] "GET / HTTP/1.0" 200 454 "-" "Python-urllib/2.7"
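Because this patch changes class attributes, it affects every HTTPConnection in the process until it is undone. One possible way to limit that, sketched here for Python 3 (where httplib is http.client), is a small context manager that restores the original values on exit. The name force_http10 is hypothetical, and the sketch relies on the same private attributes as the code above, which are not a documented API.

```python
from contextlib import contextmanager
from http.client import HTTPConnection


@contextmanager
def force_http10():
    """Temporarily make http.client send HTTP/1.0 request lines (hypothetical helper)."""
    old_vsn = HTTPConnection._http_vsn
    old_vsn_str = HTTPConnection._http_vsn_str
    HTTPConnection._http_vsn = 10
    HTTPConnection._http_vsn_str = "HTTP/1.0"
    try:
        yield
    finally:
        # Restore the previous values even if the body raised an exception
        HTTPConnection._http_vsn = old_vsn
        HTTPConnection._http_vsn_str = old_vsn_str
```

Requests made inside a `with force_http10():` block (for example via urllib.request.urlopen) would then go out as HTTP/1.0, while code running after the block sees the original setting again. The patch is still process-wide while the block is active, so it is not a safe choice when other threads make requests at the same time.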
2 Comments
houqp
I finally figured it out by defining my own handler, but your solution is much simpler, thanks :)
Tsan-Kuang Lee
For Python 3 users: httplib is replaced by http.client (and urllib2 by urllib.request); the rest works the same as in Marwan's wonderful solution.