I've got this regex pattern from the WMD showdown.js file:

/<((https?|ftp|dict):[^'">\s]+)>/gi

and the code is:

text = text.replace(/<((https?|ftp|dict):[^'">\s]+)>/gi,"<a href=\"$1\">$1</a>");

But when I set text to http://www.google.com, it does not turn it into a link; it returns the original text unchanged (http://www.google.com).

P.S: I've tested it with RegexPal and it does not match.

  • Take the <> out, it should work. This one looks to be the best: (http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])? From regexlib.com/… Commented Aug 22, 2011 at 21:06
  • The last time someone answered a question about regex and HTML it drove them mad. stackoverflow.com/questions/1732348/… Commented Aug 22, 2011 at 21:08
  • So you just want to take the whole url and put it in an anchor tag? In your example it should return <a href="http://www.google.com">http://www.google.com</a>? Commented Aug 22, 2011 at 21:12
  • @Ali, Yes that is what I wanted. Commented Aug 22, 2011 at 21:38
  • There are many more protocols than the 3 listed, are those the only ones you want? And you are creating links, not anchors. Commented Aug 22, 2011 at 23:48

3 Answers

Your code is searching for a url wrapped in <> like: <http://www.google.com>: RegexPal.

Just change it to /((https?|ftp|dict):[^'">\s]+)/gi if you don't want it to search for the <>: RegexPal
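To make the difference concrete, here is a minimal sketch comparing the two patterns on the input from the question. The helper name linkify and the variable names are made up for illustration; only the two regular expressions come from the posts above.

```javascript
// Original showdown.js pattern: only matches URLs wrapped in angle brackets.
var bracketed = /<((https?|ftp|dict):[^'">\s]+)>/gi;
// Same pattern with the surrounding <> removed: matches bare URLs as well.
var bare = /((https?|ftp|dict):[^'">\s]+)/gi;

// Hypothetical helper: wrap every match of `re` in an <a> tag.
function linkify(text, re) {
  return text.replace(re, '<a href="$1">$1</a>');
}

// Bare URL, bracketed pattern: nothing matches, the text comes back unchanged.
var r1 = linkify('http://www.google.com', bracketed);
// Bracketed URL, bracketed pattern: the <...> wrapper is replaced by the link.
var r2 = linkify('<http://www.google.com>', bracketed);
// Bare URL, bare pattern: the URL is turned into a link.
var r3 = linkify('http://www.google.com', bare);

console.log(r1);
console.log(r2);
console.log(r3);
```

Note that the bare pattern will also match URLs that already sit inside an attribute such as href="...", so it is probably only safe on text that does not yet contain links.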


As long as you know your URLs start with http://, https://, or similar, you can use:

/((https?|s?ftp|dict|www)(:\/\/)?[A-Za-z0-9.\-]+)/gi

The expression will match until it encounters a character that is not allowed in a domain name, i.e. anything outside A-Za-z0-9.\-. However, it will not detect anything of the form google.com, nor anything that comes after the domain name, such as query parameters or subdirectory paths. If you need those, you can simply use the terminating condition from your own regex above.

I know it seems pointless but it may be useful if you want the display name to be something abbreviated rather than the whole url in case of complex urls.
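As a rough sketch of that abbreviated-display-name idea: the pattern below is the one above with its parentheses balanced and the slashes escaped, and shortLink is a hypothetical helper name. It assumes the URL is trusted, since nothing is HTML-escaped here.

```javascript
// Matches a scheme (or a leading "www") followed by domain-name characters only,
// so the match stops at the first "/", "?" or other non-domain character.
var domainPart = /((https?|s?ftp|dict|www)(:\/\/)?[A-Za-z0-9.\-]+)/i;

// Hypothetical helper: full URL in href, shortened scheme + domain as link text.
function shortLink(url) {
  var m = url.match(domainPart);
  var label = m ? m[1] : url; // fall back to the whole URL if nothing matches
  return '<a href="' + url + '">' + label + '</a>';
}

var link = shortLink('http://www.google.com/search?q=regex');
console.log(link);
// <a href="http://www.google.com/search?q=regex">http://www.google.com</a>
```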

2 Comments

There are lots of other characters that are valid in a URL; pretty much anything other than a space is allowed.
Ignoring internationalized domain names... no, basically only A-Za-z0-9\- are allowed in domain names, and the - cannot be the leading or the last character. LordCover (asker) is from Syria, so I guess it's really up to him to decide what works. Either way, this regex is only useful for extracting the domain name, which wasn't the requirement to start with. (Look at "Valid characters" at en.wikipedia.org/wiki/Domain_name)

You could use:

var re = /(http|https|ftp|dict)(:\/\/\S+?)(\.?\s|\.?$)/gi;

with:

 el.innerHTML = el.innerHTML.replace(re, '<a href=\'$1$2\'>$1$2<\/a>$3');

to also match URLs at the end of sentences.

But you need to be very careful with this technique: make sure the content of the element is more or less plain text and not complex markup. Regular expressions are not meant for, nor are they good at, processing or parsing HTML.
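To see how the trailing-period handling behaves, here is a small sketch that applies the same expression to a plain string instead of innerHTML, so it can be run outside a browser. The function name linkifySentence and the sample sentences are made up for illustration.

```javascript
// Same expression as above: $1 is the scheme, $2 the rest of the URL,
// $3 the optional trailing period plus whitespace (or end of string).
var re = /(http|https|ftp|dict)(:\/\/\S+?)(\.?\s|\.?$)/gi;

// Hypothetical helper: linkify plain text, keeping sentence punctuation outside the link.
function linkifySentence(text) {
  return text.replace(re, "<a href='$1$2'>$1$2</a>$3");
}

var out1 = linkifySentence('Visit http://www.google.com.');
var out2 = linkifySentence('See http://www.google.com. Or https://example.org/a.b');

console.log(out1);
// Visit <a href='http://www.google.com'>http://www.google.com</a>.
console.log(out2);
```

Periods inside the URL (as in a.b) stay part of the link because the lazy \S+? only stops where a period is followed by whitespace or the end of the string.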
