Regex for extracting part of a URL

Question

I have a URL: http://example.com/(S(4txk2wasxh3u0slptzi20qyj))/CWC_Link.aspx

but I only want to extract this portion: (S(4txk2anwasxh3u0slptzi20qyj))/

Can anyone suggest a regex for this?

This might not be a job for regexes, but for existing tools in your language of choice. Regexes are not a magic wand you wave at every problem that happens to involve strings. You probably want to use existing code that has already been written, tested, and debugged. In PHP, use the parse_url function. Perl: URI module. Ruby: URI module. .NET: 'Uri' class.
– Andy Lester, Commented Jul 12, 2013 at 4:27
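
As a rough illustration of the comment's suggestion, here is a minimal sketch that uses a URL parser instead of a regex. Python and its standard urllib.parse module are assumptions here (the question does not name a language), and the sketch assumes the token is always the first path segment.

```python
from urllib.parse import urlsplit

url = "http://example.com/(S(4txk2wasxh3u0slptzi20qyj))/CWC_Link.aspx"

# urlsplit separates scheme, host, and path without any regex.
path = urlsplit(url).path

# Assumption: the session token is the first non-empty path segment.
segments = [s for s in path.split("/") if s]
token = segments[0] if segments else None

print(token)  # (S(4txk2wasxh3u0slptzi20qyj))
```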

quetzalcoatl · Accepted Answer · 2013-07-10 07:33:17Z

1

The key point is to notice that the () characters mark the boundaries and that no / character is in the contents:

/(\(S\([^/()]+\)\))/
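
A minimal sketch of how this pattern behaves, written in Python as an assumption since the question does not name a language. The slashes are treated as literal characters in the pattern rather than delimiters, so the whole match includes them and the first capture group holds only the token.

```python
import re

url = "http://example.com/(S(4txk2wasxh3u0slptzi20qyj))/CWC_Link.aspx"

# Literal "/" on both sides, with the token in capture group 1.
pattern = re.compile(r"/(\(S\([^/()]+\)\))/")

match = pattern.search(url)
whole = match.group(0)  # includes the bounding slashes
token = match.group(1)  # just the parenthesised token

print(whole)  # /(S(4txk2wasxh3u0slptzi20qyj))/
print(token)  # (S(4txk2wasxh3u0slptzi20qyj))
```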

answered Jul 10, 2013 at 7:33


4 Comments

Seimen Over a year ago

There's a small mistake here: the first "/" in your regex makes the match /(S(4txk2anwasxh3u0slptzi20qyj))/.

quetzalcoatl Over a year ago

@Simon: there is no mistake. Those are the bounding characters. Note that there's a capturing group in that expression. After applying the regex you should read the $1 match (first capture, (S(4txk2anwasxh3u0slptzi20qyj))), not $0 (whole match, /(S(4txk2anwasxh3u0slptzi20qyj))/). Without those bounding characters, if you pass a URL such as http://farmer.gov.in/asdada(S(foo))asdasd/(S(key))/asdasdasd, you might catch the 'foo' instead of the 'key'. Of course, that's so improbable that you can probably safely remove the extra bounding '/'s.
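
The difference the bounding slashes make can be checked with the URL from this comment. This is a small sketch in Python (an assumed language); the variable names are illustrative only.

```python
import re

url = "http://farmer.gov.in/asdada(S(foo))asdasd/(S(key))/asdasdasd"

# With literal "/" on both sides, only a token that fills a whole path segment matches.
bounded = re.search(r"/(\(S\([^/()]+\)\))/", url).group(1)

# Without them, the first (S(...)) anywhere in the string wins.
unbounded = re.search(r"(\(S\([^/()]+\)\))", url).group(1)

print(bounded)    # (S(key))
print(unbounded)  # (S(foo))
```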

Seimen Over a year ago

Even if the answer is already ticked as the solution, there's still a mistake, @quetzalcoatl, because your capturing group does not include the forward slash at the end of the string, which the OP described as the desired result. Also, you won't catch the foo if you restrict your regex to match a forward slash after the bracket.

quetzalcoatl Over a year ago

"Also you won't catch the foo if you restrict your regex to match for a forward slash": this is exactly why I included a '/' on both sides. Compare that to your regex, which is capable of capturing many more false positives. As to the tail, I intentionally left the trailing '/' out of the capture, because I take it as a typo on the OP's side: he clearly wanted to catch the 'magic string' from the URL. He didn't complain, mind you.

Trogvar · Answer · 2013-07-10 07:49:53Z

0

Here's your regex. The part in parentheses (the capturing group) extracts the fragment you need:

/^.+\/([^\/]+)\/.+$/

The logic is simple:

^ - marks the beginning of the string

.+\/ - matches everything before the next part. This relies on the default "greedy" behaviour of regex quantifiers, so it matches http://farmer.gov.in/ in your example

([^\/]+) - matches all characters between two slashes

\/.+$ - matches a slash and everything after it, up to the end of the string

Example in PHP:

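<?php
$string = "http://farmer.gov.in/(S(4txk2wasxh3u0slptzi20qyj))/CWC_Link.aspx";
// Single quotes keep the backslashes and the "$" anchor literal.
$regex = '/^.+\/([^\/]+)\/.+$/';
preg_match($regex, $string, $matches);
// $matches[0] is the whole match; $matches[1] is the captured segment.
var_dump($matches);
?>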

In the output, $matches[1] holds the value you need: (S(4txk2wasxh3u0slptzi20qyj))
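
Because the leading .+ is greedy, this pattern captures the second-to-last path segment rather than the token specifically. The sketch below is in Python (an assumed language) and checks that against the example URL and against a hypothetical URL with one extra path segment, which is invented here for illustration.

```python
import re

# Same pattern as in the answer, without the PHP delimiters.
pattern = re.compile(r"^.+/([^/]+)/.+$")

url = "http://farmer.gov.in/(S(4txk2wasxh3u0slptzi20qyj))/CWC_Link.aspx"
# Hypothetical URL with an extra "sub" segment, not taken from the question.
deeper = "http://farmer.gov.in/(S(4txk2wasxh3u0slptzi20qyj))/sub/CWC_Link.aspx"

first = pattern.match(url).group(1)
second = pattern.match(deeper).group(1)

print(first)   # (S(4txk2wasxh3u0slptzi20qyj))
print(second)  # sub
```

If the token is always directly followed by the file name, as in the question, the two approaches agree.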

edited Jul 10, 2013 at 7:49

answered Jul 10, 2013 at 7:34


Seimen · Answer · 2013-07-10 08:40:00Z

0

This regex does the job:

\(.*\)\/

Just match an opening bracket, then anything until a closing bracket with a forward slash.
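
A quick check of this pattern, sketched in Python (an assumed language). It runs against the URL from the question and against the two-token URL mentioned in the comments on the first answer, where the greedy .* spans from the first opening bracket to the last ")/".

```python
import re

pattern = re.compile(r"\(.*\)/")

url = "http://example.com/(S(4txk2wasxh3u0slptzi20qyj))/CWC_Link.aspx"
two_tokens = "http://farmer.gov.in/asdada(S(foo))asdasd/(S(key))/asdasdasd"

simple = pattern.search(url).group(0)
greedy = pattern.search(two_tokens).group(0)

print(simple)  # (S(4txk2wasxh3u0slptzi20qyj))/
print(greedy)  # (S(foo))asdasd/(S(key))/
```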

edited Jul 10, 2013 at 8:40

answered Jul 10, 2013 at 7:58
