How can I find multiple URLs within a string (href attribute)

Question

I've written a script (see here) to get all the URLs from within a template directory. However, some of the hrefs contain two URLs, and which one is used depends on the language the app runs in.

So my script currently gives me a list of whatever is in href='here', but now I also want to collect the URLs from an href that looks like this:

href="{{ 'http://www.link.com/blah/page.htm'|cy:'http://www.link.com/welsh/blah/page.htm' }}"

What regular expression would I need to return those? (As with so many people, I'm awful at Regex!)

Jon Clements · Accepted Answer · 2013-07-11 08:40:57Z


Something like:

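import re

href = "{{ 'http://www.link.com/blah/page.htm'|cy:'http://www.link.com/welsh/blah/page.htm' }}"

# Capture each URL that starts with http:// and sits between apostrophes.
# .*? is non-greedy, so each match stops at the nearest closing apostrophe.
print(re.findall(r"'(http://.*?)'", href))
# ['http://www.link.com/blah/page.htm', 'http://www.link.com/welsh/blah/page.htm']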

Takes anything starting with http:// that's inside apostrophes.


4 Comments

Ashwini Chaudhary Over a year ago

+1 You can also add http(s)? to handle both http and https.

Jon Clements Over a year ago

@AshwiniChaudhary yup, or just s? will do it... Suppose it should be up to the OP if they want to handle that/any other protocols...

markwalker_ Over a year ago

Wonderful. I was trying to find by start and end characters. Does re.findall("'(http[s]:// work to match http and https? I've seen the [s] used in an example, but don't fully understand it.

Jon Clements Over a year ago

@marksweb sorry, was out for a moment. [s] means match exactly one of the characters inside the [], so http[s] would only match https. Using s? means match zero or one s, so https? matches both http and https.
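To make the difference between the two patterns discussed in these comments concrete, here is a minimal sketch. The sample string is a hypothetical variant of the one in the question, with the second URL switched to https so that both schemes appear; it is not taken from the original templates.

```python
import re

# Hypothetical sample: same shape as the href in the question, but the
# second URL uses https so both schemes are present.
href = "{{ 'http://www.link.com/blah/page.htm'|cy:'https://www.link.com/welsh/blah/page.htm' }}"

# s? makes the "s" optional, so http:// and https:// URLs both match.
optional_s = re.findall(r"'(https?://.*?)'", href)

# [s] is a character class holding one character, so exactly one "s" is
# required and only the https:// URL matches.
required_s = re.findall(r"'(http[s]://.*?)'", href)

print(optional_s)
print(required_s)
```

With this sample, optional_s should hold both URLs and required_s only the https one. Whether to accept other protocols as well is left to the asker, as noted above.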
