
Possible Duplicate:
get site name from a URL in python

For URLs like this:

http://twitter.com/pypi
http://www.wolframalpha.com/input/?i=python

I'd like to pull out the 'http://twitter.com' or 'http://wolframalpha.com' parts.

The following code works, but I'm looking for suggestions of a cleaner way of doing it...

'/'.join(url.split('/',3)[:3])

2 Answers


You can use the urllib.parse (named urlparse prior to Python 3) module:

>>> from urllib.parse import urlparse
>>> urlparse("http://twitter.com")
ParseResult(scheme='http', netloc='twitter.com', path='', params='', query='', fragment='')
>>> r = urlparse("http://twitter.com")
>>> r.scheme + '://' + r.netloc
'http://twitter.com'
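
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name site_root and the strip_www flag are my own assumptions, not part of urllib.parse. The flag is there because the question writes the second result without the leading "www.", which urlparse keeps as part of netloc.

```python
from urllib.parse import urlparse


def site_root(url, strip_www=False):
    """Return scheme + '://' + netloc for url.

    site_root and strip_www are hypothetical names used for illustration.
    """
    parts = urlparse(url)
    netloc = parts.netloc
    # Optionally drop a leading "www." (urlparse itself never does this).
    if strip_www and netloc.startswith("www."):
        netloc = netloc[len("www."):]
    return parts.scheme + "://" + netloc


for url in ("http://twitter.com/pypi",
            "http://www.wolframalpha.com/input/?i=python"):
    print(site_root(url), site_root(url, strip_www=True))
```

Note that netloc also carries any port or user:password@ prefix present in the URL, so those would end up in the result as well.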

Another (less readable) method, using urlparse together with urlunparse (on Python 2, import both from the urlparse module instead):

>>> from urllib.parse import urlparse, urlunparse
>>> urlunparse(urlparse("http://twitter.com/pypi")[:2] + ("",) * 4)
'http://twitter.com'
