How to parse parameters in a URL Fragment using Python 3

Question

I need to parse a large number of URLs to retrieve a GUID using urllib in Python 3. Some URLs contain a fragment, and in those cases the parameters are not returned as part of the query.

import urllib.parse

url = "https://zzz.com/index.html#viewer?guid=6a755e6d-4eae&Link=true&psession=true"
parse_response = urllib.parse.urlsplit(url)
print("The parsed url components = {}".format(parse_response))

The parsed url components = SplitResult(scheme='https', netloc='zzz.com', path='/index.html', query='', fragment='viewer?guid=6a755e6d-4eae&Link=true&psession=true')

So urllib rightly sees the "#" and stores the rest of the URL as a fragment, which means the parameters never appear in the query component. What is the best way to process URLs both with and without fragments?

Vikas Khengare · Accepted Answer · 2021-08-24 14:33:57Z

from urllib.parse import urlparse, urldefrag, parse_qs

url = "https://abc.xyz.com/url/with/fragment/query#param1=val1&param2=val2&param3=val3"

# Option 1: take the fragment from urlparse and parse it like a query string.
print(urlparse(url))
print(urlparse(url).fragment)

pq = parse_qs(urlparse(url).fragment)
print(pq)
print(type(pq))
print("Using urlparse {}".format(pq["param1"][0]))

# Option 2: urldefrag returns the fragment directly.
f = urldefrag(url).fragment
print(type(f))
print(f)

pq = parse_qs(f)
print(pq)
print(type(pq))
print("Using urldefrag {}".format(pq["param1"][0]))
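The URL in the question differs from the example above in that its fragment has the form viewer?guid=..., so the parameters sit after a "?" inside the fragment. A minimal sketch for that case, assuming everything after the first "?" in the fragment can be treated as a query string (extract_params is a hypothetical helper name, not part of urllib):

```python
from urllib.parse import urlsplit, parse_qs


def extract_params(url):
    """Collect parameters from the query string and, if present, the fragment."""
    parts = urlsplit(url)
    # Parameters from the regular query component (empty dict if there is none).
    params = parse_qs(parts.query)
    # Assumption: if the fragment contains "?", the part after it is query-like;
    # otherwise the whole fragment is tried as a query string.
    _, sep, tail = parts.fragment.partition("?")
    fragment_query = tail if sep else parts.fragment
    for key, values in parse_qs(fragment_query).items():
        # Values from the real query take precedence over the fragment.
        params.setdefault(key, values)
    return params


url = "https://zzz.com/index.html#viewer?guid=6a755e6d-4eae&Link=true&psession=true"
print(extract_params(url)["guid"][0])  # 6a755e6d-4eae
```

The same function also handles URLs without a fragment, since parse_qs on an empty string returns an empty dict.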

1 Comment

Leonardo Over a year ago

According to RFC 3986, the fragment starts after the # character, so in your example, param1=val1&param2=val2&param3=val3 is the fragment. parse_qs and parse_qsl were made to parse the query, not the fragment. Your code works for a URI like http://localhost#param1=value1&param2=value1, but fails with http://localhost#param1&value1.
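To illustrate the point in this comment, here is a small sketch of how parse_qs treats the two fragments mentioned. With default arguments, keys without "=value" are silently dropped, so a later lookup such as pq["param1"] would raise KeyError; the keep_blank_values flag keeps them with empty values. The variable names are only for illustration.

```python
from urllib.parse import parse_qs

# Fragment made of key=value pairs: parsed like a normal query string.
with_values = parse_qs("param1=value1&param2=value1")
print(with_values)     # {'param1': ['value1'], 'param2': ['value1']}

# Fragment without "=": blank values are dropped by default.
without_values = parse_qs("param1&value1")
print(without_values)  # {}

# keep_blank_values=True retains the keys with empty-string values.
kept_blank = parse_qs("param1&value1", keep_blank_values=True)
print(kept_blank)      # {'param1': [''], 'value1': ['']}
```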
