47

How can we use PHP to identify URLs in a string and store them in an array?

We cannot use the explode function, because if a URL contains a comma it won't give correct results.

2
  • See also stackoverflow.com/a/11588614/1066234 Commented May 23, 2020 at 14:48
  • 2
    preg_match_all("/\b((https?):\/\/)?([a-z0-9-.]*)\.([a-z]{2,3})([-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$])/i", $string, $match); use this one Commented Nov 22, 2021 at 7:22

6 Answers

131

Regex is the answer to your problem. Taking Object Manipulator's answer, all it is missing is excluding commas, so you can try this code, which excludes them and gives 3 separate URLs as output:

$string = "The text you want to filter goes here. http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";

preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', $string, $match);

echo "<pre>";
print_r($match[0]); 
echo "</pre>";

The output is:

Array
(
    [0] => http://google.com
    [1] => https://www.youtube.com/watch?v=K_m7NEDMrV0
    [2] => https://instagram.com/hellow/
)
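
The same pattern can be wrapped in a small helper so that callers get the array directly. This is only a minimal sketch: the function name extractUrls is an assumption for illustration and not part of the answer above, and the i modifier is added so that schemes written as HTTP:// also match.

```php
<?php
// Hypothetical helper; the name extractUrls is not from the original answer.
function extractUrls(string $text): array
{
    // Same pattern as above, plus the i modifier for case-insensitive schemes.
    $pattern = '#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#i';

    // preg_match_all returns false on error; treat that as "no URLs found".
    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }

    // Index 0 holds the full matches; later indexes hold the capture groups.
    return $matches[0];
}

$string = "The text you want to filter goes here. http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";
print_r(extractUrls($string));
```

Because commas are excluded from the match, a URL that legitimately contains a comma in its query string would be cut short by this helper.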

10 Comments

You may want to make it case-insensitive by adding the i modifier, i.e. ...#i'
Just a note: some URLs use commas in their query strings.
@aampudia: Very good approach. But is there a simple way to find URLs without a protocol, too? Like: "The text you want to filter goes here. www.google.de, www.youtube.com".
Note that URLs don't always include http or https, since they can also begin with only //.
@shyammakwana.me You have to delete the [:punct:] part of the regular expression, which tells it to ignore all punctuation. If you remove that, it will take the underscore at the end.
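
One possible way to pick up the protocol-less URLs asked about in the comments is to accept a bare www. prefix as an alternative to the scheme. The sketch below is an assumption, not something proposed in the thread: the function name extractUrlsLoose is hypothetical, and URLs that begin with only // are not handled.

```php
<?php
// Hypothetical variant: accept either an http(s) scheme or a bare "www." prefix.
function extractUrlsLoose(string $text): array
{
    // (?:https?://|www\.) replaces the fixed scheme of the accepted pattern;
    // the i modifier makes the whole pattern case-insensitive.
    $pattern = '#\b(?:https?://|www\.)[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#i';

    preg_match_all($pattern, $text, $matches);

    // Index 0 holds the full matches.
    return $matches[0];
}

print_r(extractUrlsLoose("The text you want to filter goes here. www.google.de, www.youtube.com"));
```

A URL such as https://www.example.org/a is still reported once, because preg_match_all does not return overlapping matches.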
9

Please try the regex below:

$regex = '/https?\:\/\/[^\",]+/i';
preg_match_all($regex, $string, $matches);
echo "<pre>";
print_r($matches[0]); 

Hope this works for you.

1 Comment

This pattern is greedy when the URLs are not separated by a comma.
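
The greediness is easy to reproduce with URLs separated by spaces. In the sketch below the sample text is made up for illustration, and adding \s to the excluded characters is an assumed fix, not one taken from the answer.

```php
<?php
// Made-up sample text: two URLs separated by words, with no comma or quote.
$text = 'Visit http://a.example and http://b.example today';

// Pattern from the answer: only a double quote or a comma ends the match,
// so the first match runs on to the end of the text.
preg_match_all('/https?\:\/\/[^\",]+/i', $text, $greedy);
print_r($greedy[0]);

// Assumed fix: also stop at whitespace, so each URL is matched on its own.
preg_match_all('/https?\:\/\/[^\s\",]+/i', $text, $bounded);
print_r($bounded[0]);
```

The first call returns a single element containing everything from the first URL to the end of the text; the second returns the two URLs separately.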
5

You can try this regex:

$string = "The text you want to filter goes here. http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";

preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $string, $match);

echo "<pre>";
print_r($match[0]); 
echo "</pre>";

This gives the following output:

Array
(
  [0] => http://google.com
  [1] => https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/
)

2 Comments

It should have 3 results in the output array, not 2: http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0 and https://instagram.com/hellow/
[\w\d]+ === [\w]+
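
The point about [\w\d] can be checked directly: \w already covers letters, digits and the underscore, so adding \d changes nothing. A small sketch, using made-up URLs chosen so that the parenthesised branch of the pattern is actually exercised:

```php
<?php
// Made-up sample text with parenthesised suffixes, one of them purely numeric.
$text = 'See http://example.com/page_(2024) and http://example.com/topic_(php)';

$withDigitClass    = '#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#';
$withoutDigitClass = '#\bhttps?://[^\s()<>]+(?:\(\w+\)|([^[:punct:]\s]|/))#';

preg_match_all($withDigitClass, $text, $a);
preg_match_all($withoutDigitClass, $text, $b);

// Both patterns return the same list of full matches.
var_dump($a[0] === $b[0]);

// \w on its own accepts digits and the underscore as well as letters.
var_dump(preg_match('/^\w+$/', 'abc_123'));
```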
5

Try this:

function getUrls($string)
{
    $regex = '/https?\:\/\/[^\" ]+/i';
    preg_match_all($regex, $string, $matches);
    return ($matches[0]);
}

$urls = getUrls($string);
print_r($urls);

or

$str = '<a href="http://foobar.com"> | Hello world Im a http://google.fr |     Did you mean:http://google.fr/index.php?id=1&b=6#2310';
$pattern = '`.*?((http|ftp)://[\w#$&+,\/:;=?@.-]+)[^\w#$&+,\/:;=?@.-]*?`i';
if (preg_match_all($pattern, $str, $matches))
{
    print_r($matches[1]);
}

It will work.

4 Comments

No, it is still giving 2 results. There are 3 URLs but only 2 are returned. Can you see? Array ( [0] => http://google.com, [1] => https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/ )
Can you provide an example with that regex?
No, it doesn't work for my string. $string = "The text you want to filter goes here. http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";
4
$urlstring = "The text you want to filter goes here. http://google.com, https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";

preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', $urlstring , $result);

print_r($result[0]); 

1 Comment

No, it still gives only 2 URLs. It should give 3 URLs as a result.
2
$string = "The text you want to filter goes here. http://google.com,
https://www.youtube.com/watch?v=K_m7NEDMrV0,https://instagram.com/hellow/";

preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $string, $match);

// The second match still holds two URLs joined by a comma, so split it.
$arr = explode(",", $match[0][1]);

echo "<pre>";
print_r($match[0][0]);
print_r($arr);
echo "</pre>";
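
The explode step above is tied to this one string, because it assumes the joined URLs are always in the second match. A more general sketch of the same idea splits every match, but only where a comma is directly followed by a new scheme, so commas inside a query string survive. The function name, the sample text and the trailing-punctuation list are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: match loosely first, then split joined URLs afterwards.
function extractUrlsKeepingCommas(string $text): array
{
    // Allow commas inside a candidate; stop only at whitespace or ()<>.
    preg_match_all('#\bhttps?://[^\s()<>]+#i', $text, $matches);

    $urls = [];
    foreach ($matches[0] as $candidate) {
        // Split only at a comma that is immediately followed by http:// or https://.
        foreach (preg_split('#,(?=https?://)#i', $candidate) as $part) {
            // Drop trailing sentence punctuation (assumed list of characters).
            $part = rtrim($part, ',.;:!?');
            if ($part !== '') {
                $urls[] = $part;
            }
        }
    }

    return $urls;
}

// Made-up sample: one URL with a comma in its query string, two joined by a comma.
$text = "Map: http://example.com/map?ll=1,2 and list: http://google.com,https://instagram.com/hellow/";
print_r(extractUrlsKeepingCommas($text));
```

For the sample text this keeps http://example.com/map?ll=1,2 intact while still separating the two comma-joined URLs.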

Comments
