I am using a regex to validate URLs. It works very well, except that it also accepts URLs with no http:// at the front. I want it to match only when the URL starts with http:// (even if no www follows it).

This is the regex I'm using:

((https?)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?
  • Don't validate URLs yourself. Whatever language you are writing in undoubtedly has code that has already been written, tested and debugged. What language are you using? Commented Dec 21, 2012 at 2:38
  • PHP, I've used filter_var('http://example.com', FILTER_VALIDATE_URL) but it validates urls such as http://example.c and http://example.comxx which should not validate. What native PHP function would you suggest? Commented Dec 21, 2012 at 17:22
  • Why do you think those two shouldn't validate? Commented Dec 21, 2012 at 17:29
  • Because they aren't actual URLs? The '.com' part of a URL usually ranges from 2-4 chars. I know ICANN has introduced some new domain extensions over the past few years which can have up to 8 chars, but I do not think there are domain extensions of 1 char. Though I could be wrong; I'm basing this on Wikipedia. Commented Dec 21, 2012 at 23:58

2 Answers

Remove the second ? from the left (the one right after the first closing parenthesis). It acts as a quantifier that makes the whole http:// group optional.
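To see the effect of that one character, here is a minimal sketch in Python (the question is about PHP, so treat this as an illustration only). The names `REST`, `SCHEME_OPTIONAL`, `SCHEME_REQUIRED` and `is_valid` are made up for this example, and the case-insensitive flag is an assumption, since the question does not say which flags are in use.

```python
import re

# Everything after the scheme, copied unchanged from the question:
# userinfo, host, TLD, port, path, query string, fragment.
REST = (
    r"([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?"
    r"([a-z0-9-.]*)\.([a-z]{2,3})(\:[0-9]{2,5})?"
    r"(\/([a-z0-9+\$_-]\.?)+)*\/?"
    r"(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"
    r"(#[a-z_.-][a-z0-9+\$_.-]*)?"
)

# Original: the trailing ? makes the whole scheme group optional.
SCHEME_OPTIONAL = re.compile(r"((https?)\:\/\/)?" + REST, re.IGNORECASE)

# Fixed: same group without the trailing ?, so http:// or https:// is required.
SCHEME_REQUIRED = re.compile(r"((https?)\:\/\/)" + REST, re.IGNORECASE)


def is_valid(pattern, url):
    """Return True only if the whole string matches the pattern."""
    return pattern.fullmatch(url) is not None


for url in ("http://example.com", "https://example.com", "example.com"):
    print(url, is_valid(SCHEME_OPTIONAL, url), is_valid(SCHEME_REQUIRED, url))
```

Note that `fullmatch` anchors the pattern at both ends. An unanchored search would also accept a longer string that merely contains a URL, so in PHP's `preg_match` you would probably want `^` and `$` around the pattern as well.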

1 Comment

Thanks! For the lazy: this is the regex I'm using: ((https?)\:\/\/)([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?
Does making just the (s) optional work?

(http(s)?\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?

Edit: after seeing the other answer, this version requires http:// or https:// at the front:

(http(s)?\:\/\/)([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?
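For what it's worth, the two ways of writing the scheme group, `((https?)\:\/\/)` and `(http(s)?\:\/\/)`, accept the same prefixes. They differ only in what the inner capture group holds. A small Python sketch of just the scheme part (the names `FIRST_ANSWER`, `SECOND_ANSWER` and `scheme_groups` are made up for this illustration):

```python
import re

# Scheme group as written in the question and the first answer.
FIRST_ANSWER = re.compile(r"((https?)\:\/\/)", re.IGNORECASE)

# Scheme group as written in this answer.
SECOND_ANSWER = re.compile(r"(http(s)?\:\/\/)", re.IGNORECASE)


def scheme_groups(pattern, url):
    """Return the capture groups for the scheme prefix, or None if it is missing."""
    m = pattern.match(url)
    return m.groups() if m else None


for url in ("http://example.com", "https://example.com", "example.com"):
    print(url, scheme_groups(FIRST_ANSWER, url), scheme_groups(SECOND_ANSWER, url))
```

With the first form, group 2 holds `http` or `https`. With the second, it holds `s` or nothing at all. That only matters if later code reads the capture groups.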

1 Comment

So you're just copying my answer then?
