
I'm having trouble retrieving a URL parameter from a string using regular expressions:

An example string could be:

some text and http://google.com/?something=this&tag=yahoo.com and more text

and I would like to be able to extract yahoo.com from it.

The caveat is that I need to ensure that the string begins with http://google.com, and not just search for &tag=(.*)

preg_match("/google\.com\/.*&tag=(.*) $/", $subject, $matches)

I'm hoping this matches google.com followed by anything, then &tag=, then the value, ending at a space. Ultimately, the goal is to parse out all of the tag= values from google.com URLs.

Is there a better way to accomplish this?

Update:

So I have this new regex: /google\.com\/.*(tag=.*)/ but I'm not sure how to get it to stop at the space after the URL.

  • What's the issue with your code? Commented Aug 7, 2013 at 22:25
  • Why the space at the end of your pattern? Commented Aug 7, 2013 at 22:26
  • I'm hoping to match the end of the string with a space... (I should probably add $) Commented Aug 7, 2013 at 22:27
  • Do not hope. Create a list of URLs that should pass and another list of invalid ones, then write a unit test to check that your function does what you want it to. Commented Aug 7, 2013 at 22:31

2 Answers

4

Get friendly with the parse_url() function!

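// With this input the query string is everything after the "?",
// so the trailing text ends up inside the last parameter's value.
$pieces = parse_url('some text http://google.com/?something=this&tag=yahoo.com and whatever');

// Split the query string into an associative array of parameters.
parse_str($pieces['query'], $get);

// Cut each value at its first space to drop the trailing text.
array_walk($get, function (&$item) {
    if (!$sp = strpos($item, ' ')) return;
    $item = substr($item, 0, $sp);
});

var_dump($get); // woo!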

edit: thanks to Johnathan for the parse_str() function.
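
The snippet above does not check the host, which the question asks for. One possible way to add that check is to pull the URLs out of the text first and then hand each one to parse_url(). The following is only a minimal sketch, not the answerer's code: the function name extractTags and the simple `https?://\S+` URL pattern are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): collect the "tag"
// parameter of every google.com URL found in free text.
function extractTags(string $text): array
{
    $tags = [];

    // Assumption: a URL is "http(s)://" followed by non-whitespace characters.
    preg_match_all('#https?://\S+#i', $text, $matches);

    foreach ($matches[0] as $url) {
        $parts = parse_url($url);

        // Skip malformed URLs and URLs without a host or query string.
        if ($parts === false || !isset($parts['host'], $parts['query'])) {
            continue;
        }
        // Keep only URLs whose host is exactly google.com.
        if (strcasecmp($parts['host'], 'google.com') !== 0) {
            continue;
        }

        // parse_str() decodes the query string into an associative array.
        parse_str($parts['query'], $params);

        if (isset($params['tag']) && is_string($params['tag'])) {
            $tags[] = $params['tag'];
        }
    }

    return $tags;
}

$text = 'some text and http://google.com/?something=this&tag=yahoo.com and more text';
print_r(extractTags($text));
```

Because the URL pattern simply runs to the next whitespace, punctuation that directly follows a URL (a trailing period or closing parenthesis, for example) would be treated as part of it; real input may need extra trimming.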


3 Comments

Thanks, I've never used that, but the string in this example will contain more text than just the URL.
Don't forget parse_str to parse the query string.
@d-_-b, it's no problem that there's extra text; parse_url pulls it out.
1

If you want to get the value of tag, the following regex will do the job:

$string = 'some text and http://google.com/?something=this&tag=yahoo.com
and more text
http://google.com/?something=this&tag=yahoo2.com&param=test
';
preg_match_all('#http://google.com\S+&tag=([^\s&]+)#', $string, $m);
print_r($m[1]);

Output

Array
(
    [0] => yahoo.com
    [1] => yahoo2.com
)

Explanation

  • http://google.com : match http://google.com
  • \S+ : match one or more non-whitespace characters
  • &tag= : match &tag=
  • ([^\s&]+) : match one or more characters that are neither whitespace nor &, and capture them in a group

If you want, you can also add s? after http to account for https, or add the i modifier to make the match case-insensitive.
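
The variations mentioned above can be combined into one pattern. This is a rough sketch rather than the answer's own code, and it makes a few assumptions: the tag parameter may be the first parameter as well as a later one, the dot in google.com is escaped so that it matches only a literal dot, the host is followed by a slash, and the function name findGoogleTags is made up for the example.

```php
<?php
// Hypothetical helper: return every tag= value from google.com URLs in $text.
function findGoogleTags(string $text): array
{
    // https?        : http or https
    // google\.com/  : the literal host followed by a slash
    // \S*?          : as few non-whitespace characters as possible
    // [?&]tag=      : the tag parameter, first or later in the query string
    // ([^\s&]+)     : the value, up to whitespace or the next "&"
    // i             : case-insensitive matching
    $pattern = '#https?://google\.com/\S*?[?&]tag=([^\s&]+)#i';

    preg_match_all($pattern, $text, $m);

    return $m[1];
}

$string = "some text and http://google.com/?something=this&tag=yahoo.com\n"
        . "and more text\n"
        . "https://google.com/?tag=yahoo2.com&param=test\n";

print_r(findGoogleTags($string));
```

Requiring `[?&]` directly before `tag=` keeps parameters such as `mytag=` from matching by accident.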

2 Comments

This is great, thanks! I'll review this in a little while. Quick question: why the # at the start?
@d-_-b It's the delimiter. I used # instead of the / that you're using. Why? Because with a delimiter other than /, I don't need to escape the slashes, for example in http:// :)
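
To illustrate the point about delimiters: the two patterns in this small sketch are equivalent and differ only in the delimiter. The sample string is made up for the example.

```php
<?php
$subject = 'see http://google.com/?tag=yahoo.com for details';

// With "/" as the delimiter, every literal slash must be escaped.
preg_match('/http:\/\/google\.com\/\?tag=(\S+)/', $subject, $a);

// With "#" as the delimiter, slashes can be written as-is.
preg_match('#http://google\.com/\?tag=(\S+)#', $subject, $b);

// Both patterns capture the same value.
var_dump($a[1] === $b[1]);
```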
