I can use urllib2 to make HEAD requests like so:

import urllib2
request = urllib2.Request('http://example.com')
request.get_method = lambda: 'HEAD'
urllib2.urlopen(request)

The problem is that when urllib2 follows a redirect, it appears to switch from HEAD to GET for the redirected request.

The purpose of this HEAD request is to check the size and content type of the URL I'm about to download, so that I don't download some huge document. (The URL is supplied by a random internet user through IRC.)

How could I make it use HEAD requests when following redirects?

  • Requests at least claims to do this the right way (it documents its redirect behaviour as working for idempotent methods, and calls out HEAD specifically in the docs). Commented Apr 1, 2012 at 19:41
  • a similar solution: stackoverflow.com/questions/9890815/… Commented Apr 1, 2012 at 21:00

2 Answers

You can do this with the requests library:

>>> import requests
>>> r = requests.head('http://github.com', allow_redirects=True)
>>> r
<Response [200]>
>>> r.history
[<Response [301]>]
>>> r.url
u'https://github.com/'
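
To tie this back to the size and content-type check described in the question, here is a minimal sketch of how the headers of the final response might be inspected. The function name `looks_safe_to_download`, the 5 MiB limit, and the list of allowed types are all assumptions made for illustration, not part of requests.

```python
def looks_safe_to_download(headers, max_bytes=5 * 1024 * 1024,
                           allowed_types=("text/html", "text/plain")):
    """Return True if HEAD response headers suggest a small, expected document.

    `headers` is any mapping of header names to values, for example
    `r.headers` from requests. The limit and types are illustrative.
    """
    # Normalise header names so plain dicts work as well as
    # case-insensitive header objects.
    lowered = {k.lower(): v for k, v in headers.items()}

    # Content-Type may carry parameters such as "; charset=utf-8".
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    if content_type not in allowed_types:
        return False

    # Content-Length is optional; treat a missing or malformed value as unsafe.
    try:
        length = int(lowered.get("content-length", ""))
    except ValueError:
        return False
    return 0 <= length <= max_bytes
```

With the response above this would be called as `looks_safe_to_download(r.headers)`. A server can omit or misreport Content-Length, so capping the number of bytes read during the actual download is still worthwhile.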

Good question! If you're set on using urllib2, you'll want to look at this answer about the construction of your own redirect handler.

In short (read: blatantly stolen from the previous answer):

import urllib2

class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        # Hook that runs before each redirect is followed.
        print "Cookie Manip Right Here"
        return urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)

    # Handle the other redirect codes the same way.
    http_error_301 = http_error_303 = http_error_307 = http_error_302

cookieprocessor = urllib2.HTTPCookieProcessor()

# Passing the subclass replaces the default redirect handler.
opener = urllib2.build_opener(MyHTTPRedirectHandler, cookieprocessor)
urllib2.install_opener(opener)

response = urllib2.urlopen("WHEREEVER")
print response.read()

print cookieprocessor.cookiejar
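
The handler above only hooks into the redirect; it does not change which method the redirected request uses. A rough sketch of a handler that keeps HEAD across redirects might look like the following. It is written against `urllib.request`, the Python 3 successor to urllib2, and the names `HeadPreservingRedirectHandler` and `head` are made up for this example. urllib2 exposes the same `redirect_request` hook, but its `Request` has no `method` attribute, so there `get_method` would need to be overridden as in the question.

```python
import urllib.request


class HeadPreservingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler that re-issues HEAD requests as HEAD."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler validate the redirect and build the new request.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        # Carry the HEAD method over explicitly instead of relying on the default.
        if new_req is not None and req.get_method() == "HEAD":
            new_req.method = "HEAD"
        return new_req


def head(url, timeout=10):
    """Send a HEAD request to `url`, following redirects with HEAD."""
    # Passing the subclass replaces the default redirect handler.
    opener = urllib.request.build_opener(HeadPreservingRedirectHandler)
    request = urllib.request.Request(url, method="HEAD")
    return opener.open(request, timeout=timeout)
```

Calling `head('http://example.com')` should return a response whose `headers` can then be checked for Content-Length and Content-Type before deciding whether to download.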

Also, as mentioned in the comments, you can use the Requests library.

1 Comment

I ended up using this redirect handler, based on what you found: pastebin.com/m7aN21A7 Thanks!
