I'm not sure how to stop this preg_replace_callback from cutting URLs off at an underscore or colon. I'm also not sure which other special characters I might be missing that would cut the URL short.

$text = preg_replace_callback('@(https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?)@', function($m) {
   return '<a href="' . $m[1] . '" target="_blank">' . substr($m[1], 0, 75) . '</a>';
}, $text);

The text may contain a link such as https://something.com/something_something:moresomething, but only https://something.com/something gets linked; the rest, _something:moresomething, is left as plain text. I used both symbols in this example, but it also happens with each one individually.

  • What do you mean by "snip the url"? Can you update the question with example strings that the pattern should match, and your desired result or what you want to accomplish? Commented Mar 17, 2023 at 11:08
  • @Thefourthbird When generating a URL, it does not include anything after an underscore or colon. I will try to show this above. Commented Mar 17, 2023 at 18:16

1 Answer

The character class in the path part, [-\w/_\.], does not include a colon, so the match stops there; adding it gives [-\w/_\.:]*. Since the callback only uses $m[1], you can also simplify the pattern by omitting the capture groups entirely and using the full match, $m[0].

Note that you don't have to escape the dot \. in a character class, and \w also matches _ so you don't have to add that separately to the character class.
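
Both points in the note above can be checked with a few preg_match calls. This is only a minimal sketch; the variable names are made up for illustration.

```php
<?php
// Inside a character class, an unescaped dot is a literal dot, not "any character".
$dotMatchesDot    = preg_match('/^[.]$/', '.');   // 1
$dotMatchesLetter = preg_match('/^[.]$/', 'a');   // 0

// \w already covers the underscore, so listing _ again is redundant.
$wordMatchesUnderscore = preg_match('/^\w$/', '_'); // 1

var_dump($dotMatchesDot, $dotMatchesLetter, $wordMatchesUnderscore);
```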

https?://(?:[-\w.]+)+(?::\d+)?(?:/(?:[-\w/.:]*(?:\?\S+)?)?)?

The pattern matches:

  • https?:// Match the protocol with an optional s
  • (?:[-\w.]+)+ Match 1+ of the characters listed in the character class (-, a word character, or .), repeated 1+ times
  • (?::\d+)? Optionally match : and 1+ digits
  • (?: Non capture group
    • / Match literally
    • (?: Non capture group
      • [-\w/.:]* Match 0+ of the characters listed in the character class (-, a word character, /, . or :)
      • (?:\?\S+)? Optionally match ? and 1+ non whitespace chars
    • )? Close non capture group and make it optional
  • )? Close non capture group and make it optional
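
To see how the parts described above behave, here is a small sketch that runs the pattern against a few made-up inputs (the sample strings are assumptions chosen for illustration, not from the question). It also touches on the "other special characters" part of the question: any path character that is not in [-\w/.:], such as %, ends the match, while a sentence-ending dot is included because . is in the class.

```php
<?php
$pattern = '@https?://(?:[-\w.]+)+(?::\d+)?(?:/(?:[-\w/.:]*(?:\?\S+)?)?)?@';

// Hypothetical sample inputs, chosen to exercise different parts of the pattern.
$samples = [
    'http://example.com:8080/a_b:c?x=1&y=2 and more text', // port, path, query
    'https://example.com/a%20b',                           // % is not in the path class
    'Visit https://example.com/page.',                     // trailing dot is in the path class
];

$found = [];
foreach ($samples as $sample) {
    preg_match($pattern, $sample, $m);
    $found[] = $m[0];
    echo $m[0], PHP_EOL;
}
// Prints:
// http://example.com:8080/a_b:c?x=1&y=2
// https://example.com/a
// https://example.com/page.
```

If URLs in your text can contain characters such as %, ~, + or =, they would have to be added to the path character class in the same way as the colon.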

For example:

$text = "https://something.com/something_something:moresomething";
$pattern = '@https?://(?:[-\w.]+)+(?::\d+)?(?:/(?:[-\w/.:]*(?:\?\S+)?)?)?@';
$text = preg_replace_callback($pattern, function($m) {
    return '<a href="' . $m[0] . '" target="_blank">' . substr($m[0], 0, 75) . '</a>';
}, $text);

echo $text;

Output

<a href="https://something.com/something_something:moresomething" target="_blank">https://something.com/something_something:moresomething</a>
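
One possible refinement, not something the question asked for: because the query part is matched with \S+, the matched URL can contain characters such as & or ", so it may be worth escaping it before placing it in the HTML. The sketch below assumes $text is raw text that has not been HTML-escaped yet; the function name linkify is hypothetical.

```php
<?php
// Hypothetical helper; uses the same pattern as above.
function linkify(string $text): string
{
    $pattern = '@https?://(?:[-\w.]+)+(?::\d+)?(?:/(?:[-\w/.:]*(?:\?\S+)?)?)?@';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // Escape the URL for the attribute and the (shortened) link text separately.
        $href  = htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8');
        $label = htmlspecialchars(substr($m[0], 0, 75), ENT_QUOTES, 'UTF-8');
        return '<a href="' . $href . '" target="_blank">' . $label . '</a>';
    }, $text);

    // preg_replace_callback returns null on error; fall back to the input in that case.
    return $result ?? $text;
}

echo linkify('See https://something.com/a_b:c?x=1&y=2 for details');
// See <a href="https://something.com/a_b:c?x=1&amp;y=2" target="_blank">https://something.com/a_b:c?x=1&amp;y=2</a> for details
```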