I'm scraping Reddit usernames using Python and I'm trying to extract the username from a URL. The URL looks like this:
https://www.reddit.com/user/ExampleUser
This is my code:
def extract_username(url):
    start = url.find('https://www.reddit.com/user/') + 28
    end = url.find('?', start)
    end2 = url.find("/", start)
    return url[start:end] and url[start:end2] and url[start:]
The first part works, but removing the question mark and forward slash doesn't. Maybe I'm using the "and" keyword wrong? As a result, I sometimes get something like this:
ExampleUser/
ExampleUser/comments/
ExampleUser/submitted/
ExampleUser/gilded/
ExampleUser?sort=hot
ExampleUser?sort=new
ExampleUser?sort=top
ExampleUser?sort=controversial
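For reference, here is a minimal sketch of what the "and" chain appears to be doing with one of these URLs. In Python, `a and b and c` returns the first falsy operand, or the last operand if all of them are truthy; it does not combine the slices. The variable names (`by_question`, `by_slash`, `rest`) are made up for illustration.

```python
# `a and b and c` returns the first falsy operand, otherwise the last one.
url = "https://www.reddit.com/user/ExampleUser/comments/"
start = len("https://www.reddit.com/user/")  # 28

# No "?" in this URL, so find() returns -1 and the slice only drops the last character.
by_question = url[start:url.find("?", start)]
# Slice up to the first "/" after the prefix.
by_slash = url[start:url.find("/", start)]
# Everything after the prefix, untouched.
rest = url[start:]

# All three slices are non-empty (truthy), so the chain yields the last one.
result = by_question and by_slash and rest
print(result)
```

So whenever the first two slices are non-empty, the function returns `url[start:]` unchanged, which would match the output listed above.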
I know I can use the API, but I'd like to learn how to do it without it. I've also heard about regular expressions, but aren't they pretty slow?
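One possible approach without the API or regular expressions is the standard library's `urllib.parse.urlsplit`, which separates the path from the query string. The following is only a sketch: it assumes the username is always the path segment right after `/user/`, and returning `None` for other URLs is an illustrative choice.

```python
from urllib.parse import urlsplit


def extract_username(url):
    """Return the path segment after /user/, or None if the URL has no such segment."""
    # urlsplit() keeps the query string ("?sort=hot") out of .path.
    path = urlsplit(url).path
    # Drop empty pieces produced by leading, trailing, or doubled slashes.
    segments = [s for s in path.split("/") if s]
    # Assumed layout: /user/<name>/..., so the name is the second segment.
    if len(segments) >= 2 and segments[0] == "user":
        return segments[1]
    return None


print(extract_username("https://www.reddit.com/user/ExampleUser/comments/"))
print(extract_username("https://www.reddit.com/user/ExampleUser?sort=new"))
```

For a string this short, the running time of either plain string methods or a regular expression is likely negligible, so readability is probably the better basis for choosing.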